As metal-oxide-semiconductor (MOS) transistor dimensions are decreased,
channel-length  modulation,  polysilicon-gate  depletion,  and  intrinsic-capacitance 
degradation  have  increasingly  larger  impacts  on  transistor  performance.  It  is 
demonstrated  that  the  Pao-Sah  1-D  current  model  can  be  extended  to  include  the  channel- 
length  modulation  effect  by  use  of  a  two-section  model.  This  two-section  model  employs 
the  normal  long-channel  Pao-Sah  model  in  one  region  and  adds  a  variable  length 
depletion  region  in  the  other.  Three  methods  for  matching  the  boundary  between  the  two 
regions  are  presented,  with  the  best  results  coming  from  the  most  complex  method  of 
matching  the  longitudinal  fields  at  the  boundary  point. 

The  effect  of  polysilicon-gate  depletion  on  the  MOS  low-frequency 
capacitance-voltage  (LFCV)  characteristics  is  demonstrated  using  a  Fermi-Dirac-based 
model. It is shown that, as the oxide thickness decreases, the effect of polysilicon
depletion becomes increasingly pronounced. This depletion, in conjunction with the
Fermi-Dirac carrier distribution, offsets the current gain expected from thinning the MOS
gate  oxide.  With  this  polysilicon-gate  LFCV  model,  it  is  shown  that  the  oxide  thickness, 
flatband  voltage,  and  gate  and  substrate  doping  concentrations  can  be  extracted  from 
experimental  capacitance  data.  Two  extraction  methods,  the  3-point  and  3-region,  are 
developed and are shown to work well with gate oxide thicknesses of 130 Å (2.7% RMS fit)
and sub-30 Å (10% RMS fit).

Voltage-accelerated stress is performed on state-of-the-art 0.24 µm effective-channel-length
nMOS and pMOS devices to assess the impact on the most important
intrinsic  capacitances:  Cgd  and  Cgs.  The  nMOS  devices  exhibit  a  Cgd  reduction  and  Cgs 
enhancement  with  stress  time,  whereas  the  pMOS  devices  show  negligible  change. 
Because of Miller feedback, the nMOS Cgd reduction dominates the Cgs increase,
resulting in an overall CMOS capacitive load reduction. Pre-stress and post-stress ID,
Cgd, and Cgs data were fit using the BSIM3 device model. With the resulting parameter
sets, a 31-stage ring oscillator was simulated for three situations: unstressed devices,
stressed devices including only ID degradation, and stressed devices including ID, Cgd,
and Cgs degradation. It is shown that the inclusion of the intrinsic capacitance
degradation  results  in  improved  simulated  circuit  performance  because  the  capacitive 
load  reduction  offsets  the  drain  current  reduction.  This  improved  degradation 
methodology  will  result  in  looser  guardbands  and  less  reliability  redesign. 


CHAPTER  1 
INTRODUCTION 

The last three decades of production integrated circuits (IC) have seen a two
orders of magnitude decrease in device dimensions, from 25 µm in 1962 to 0.25 µm in
1997 [1-3]. This continual reduction, fueled by requirements for higher switching speeds,
lower cost, and decreased power, has been sustained by improvements in lithography and
has resulted in increased areal and chip densities (transistors/cm² and transistors/chip).
Compared to the ~500 transistors/chip in the first experimental 64-bit static random-
access memory (SRAM) in 1965, the ~64M transistors/chip 64 Mbit dynamic random-
access memory (DRAM) of 1997 and the ~4G transistors/chip 4 Gbit DRAMs due from
NEC in 2000 typify the strong push toward increased density.

Increased  areal  density  implies  decreased  dimensions.  As  transistor  and 
capacitor  dimensions  decrease,  previously  negligible  effects  have  become  or  are 
becoming  increasingly  important.  Many  of  these  effects  were  assumed  avoidable  through 
constant-field scaling [4]. These scaling rules have been debated, amended, and
improved [5-7] to account for noise-margin, hot-electron, and extrinsic-capacitance
considerations, but present and future smaller dimensions require that these effects
be included in the design process. Several of these effects are discussed below.

As channel lengths decrease, the thickness of the space-charge region at the
drain of a metal-oxide-semiconductor (MOS) transistor becomes a significant fraction of
the total channel length. As the drain voltage is changed, the space-charge-region
thickness also changes, resulting in an effective channel length which is drain-voltage
dependent, an effect known as channel-length modulation (CLM). This problem can be
tolerated for complementary MOS (CMOS) logic circuits, but it needs to be properly
modeled in order to predict the drive current of the MOS transistors and thereby
estimate the speed of the resulting circuit.

As  the  density  of  transistors  increases,  so  does  the  power  density.  This 
requires  a  reduction  in  the  operating  voltage,  since  the  active  (switching)  output  power  is 
proportional to the square of the operating voltage ($P_{active} \propto f_{clock} C V^2$). To obtain the
same performance at lower voltages, the oxide thickness must be reduced. Simple MOS
theory predicts the drain current is inversely proportional to the gate oxide thickness.
However, for thin oxides (< 50 Å), depletion of the polysilicon gate offsets the effects of
thinner  oxides,  resulting  in  lower  current  and  diminishing  returns  on  oxide  scaling. 
Additionally,  the  gate  voltage  cannot  be  reduced  indefinitely,  because  a  large  enough 
margin  is  needed  between  the  signal  voltage  and  ground-plane  noise  to  ensure  that  noise 
does  not  change  the  state  of  the  device. 

The  increased  density  of  transistors  also  requires  more  closely-spaced 
interconnections  between  the  transistors.  Interconnect  scaling  has  made  delays  due  to  the 
interconnection  a  limiter  in  process  speed  [8],  and  major  efforts  are  currently  underway  to 
reduce the interconnect resistance and capacitance. Copper was recently introduced
into production by IBM (1998) to reduce the interconnect resistance. Additional efforts
have been underway to lower the dielectric constant of the intermetal dielectrics in order
to reduce the interconnect capacitance. When the interconnect capacitance is reduced, the
only capacitance left to slow the CMOS circuitry is the intrinsic capacitance of
the transistors, which cannot be easily reduced and will become the predominant speed
limiter.

There are many other issues concerning the perpetual reduction in transistor
dimensions, not the least of which is the brick wall of atomic dimensions. Clearly transistors
cannot be scaled to less than ten or twenty atoms and still work in the traditional sense of
transistors, yet this dissertation includes data from a transistor pushing the atomic limit
with a gate insulator thickness of less than 30 Å, or under six atomic layers of silicon and
oxygen.  The  goal  of  this  dissertation  is  to  investigate  the  issues  described  in  the  previous 
paragraphs. 

Chapter 2 discusses the history of 1-dimensional drain current models and
some of the methods which have been implemented to extend these models to include the
CLM effect. The Pao-Sah model, the most accurate long-channel current model, will be
extended to include the CLM effect using three different approaches. The CLM effect (as
demonstrated  in  the  new  models)  will  be  discussed,  as  well  as  the  pros  and  cons  of  the 
approaches. 

Chapter 3 tackles the polysilicon depletion problem by deriving the Fermi-
Dirac-statistics-based polysilicon-gate MOS low-frequency capacitance model, including
the effect of dopant impurity deionization. By comparing this with the traditional metal-
gate model, the effect of polysilicon gate depletion will be shown to increase significantly
as the oxide thins. With this model, a parameter extraction methodology is presented
which allows the extraction of substrate and gate doping concentrations as well as the
oxide thickness and flatband voltage from experimental LFCV data. Two methodologies
will be presented and compared, and data from thick (130 Å) and thin (< 30 Å) gate-oxide
devices will be used. Additional oxide thickness issues, such as quantum effects, are also
discussed.

Chapter  4  considers  the  intrinsic  capacitances,  in  particular,  those  most 
important  in  modern  complementary  MOS  (CMOS)  circuits:  Cgd  and  Cgs.  Compared  to 
the  drain  current,  which  is  also  an  intrinsic  property  of  a  MOS  transistor,  intrinsic 
capacitances  have  been  relatively  ignored  because  of  measurement  difficulty  and 
relatively  small  impact  compared  to  extrinsic  capacitances.  However,  as  processing  and 
dielectric  technology  advances,  the  primary  remaining  capacitive  load  in  CMOS  circuits 
will be the intrinsic capacitances. The chapter presents an experimental investigation
of how these capacitances change with hot-carrier stress and, after modeling the stress-
induced changes in the intrinsic capacitances, shows that part of the drain current
degradation  is  offset  by  the  intrinsic  capacitance  reduction,  resulting  in  a  slower 
degradation  of  overall  circuit  performance. 


CHAPTER  2 
EXTENDING  THE  ONE-DIMENSIONAL  CURRENT  MODEL 

Introduction 

The  simplest  1-D  model  is  of  crucial  importance  for  applications  in 
semiconductor  physics.  Although  3-D  models  will  best  match  experimental  data  because 
of  both  inclusion  of  real  effects  and  simply  additional  variables,  they  may  be  intractable 
as  compact  device  models,  where  computational  efficiency  is  critical.  Conversely,  these 
3-D  models  are  often  validated  by  demonstrating  their  reduction  to  the  rigorous  1-D 
forms  for  non-critical  (wide  and  long  channels  with  thick  oxides)  geometries.  For  back- 
of-the-envelope  calculations,  knowledge  of  the  basic  physics  embodied  in  a  good  1-D 
model  is  exceedingly  useful. 

The  required  accuracy  of  a  model  is  largely  determined  by  the  application. 
For  predicting  the  drive  current,  such  as  might  be  required  for  a  discrete-transistor 
specification  sheet,  a  model  need  not  worry  about  the  linear  or  subthreshold  regions  of 
operation.  Similarly,  if  modeling  only  the  operating  range  (0  to  power  supply  voltage), 
then  the  accumulation  region  of  applied  gate  voltages  can  be  ignored  in  the  model.  There 
are  cases,  particularly  when  attempting  to  predict  the  performance  of  new  technology, 
where  3-D  full-range  MOSFET  models  are  necessary,  but  they  are  a  relative  minority 
compared  to  the  wide  array  of  applications  for  1-D  models. 


This  chapter  contains  a  brief  history  of  one-dimensional  (1-D)  approaches  to 
drain  current  models,  including  calculations  and  comparisons,  followed  by  a  new  two- 
section  model  using  the  1-D  Pao-Sah  long-channel  IV  model  in  conjunction  with  a 
variable-length  depletion  region.  The  goal  is  to  extend  the  1-D  long-channel  model  to 
short-channel  use. 

Background 

In 1928 Lilienfeld [9] submitted the patent for the first MOSFET device, an
Al/Al2O3/Cu2S transistor. Thirty-two years later in 1960, Kahng and Atalla [10]
fabricated the first silicon MOS transistor (MOST). A year later, in 1961, the first MOST
current-voltage (IV) papers were published internally at AT&T Bell Labs by Kahng [11] and
later at Stanford by Ihantola [12]. These were followed in 1964 by more complete (and
widely released) 1-D theories by Sah [13] and Ihantola and Moll [14]. A comprehensive
history of MOS developments was reviewed by Sah [1]. In the years since
the first MOST model, hundreds of papers and theses have been written about the
modeling of various aspects of MOS transistors. This chapter will discuss the prevailing
1-D  models  including  Pao-Sah,  bulk-charge,  charge-sheet,  and  the  many  two-section 
models. 

Long-Channel  Theory 

"Long  channel"  is  a  term  used  to  specify  that  short-channel  effects  can  be 
neglected when modeling MOSTs, and the predominant short-channel effect is
encroachment of the drain depletion region into the channel. The depletion region exists
due to the reverse-biased p/n junction between the substrate and the drain, and is
present regardless of the channel length. For long-channel devices, the
amount of encroachment relative to the channel length is small, so the effective channel
length is essentially constant (equal to the drawn gate length). For short channels,
however, the effective channel length can be significantly reduced by the encroachment.
Another short-channel effect neglected in long-channel theory is drain-induced barrier
lowering [15], where the source barrier is lowered by the applied drain voltage.

Pao-Sah  Model 

The  most  accurate  long-channel  theory  was  published  by  Pao  and  Sah  (PS) 
[16].  The  PS  model  is  the  only  one  which  correctly  accounted  for  drift  and  diffusion.  The 
PS  theory,  to  be  discussed  below,  contains  a  double  integral,  but  can  be  reduced  to  a  more 
efficient  form  containing  only  single  integrals  [17,  18].  Although  cumbersome  to 
calculate,  the  PS  double  integral  is  extremely  didactic  and  is  a  useful  starting  point  for 
showing  the  approximations  used  to  derive  other  long-channel  IV  models.  The  total 
current  flowing  in  the  channel  is  given  by  the  integral 

$$ I_D = \int_0^{x_i} J(x,y)\,Z\,dx, \tag{2.1} $$

where 

$$ J(x,y) = J_N + J_P \approx J_N = q\mu_n N E_y + q D_n \nabla N = q D_n N \nabla\xi . \tag{2.2} $$

$J_N$ and $J_P$ are the electron and hole current densities, respectively, and (2.2) assumes that
the current is dominated by electrons, as in an n-channel device. The electron charge
is q, $\mu_n$ and $D_n$ are the electron mobility and diffusivity, respectively, and $\nabla N$ is the
gradient of the electron concentration. The electron quasi-Fermi level, $\xi$, is measured
relative to the bulk Fermi level and normalized to kT/q.

If $d\xi/dx$ is assumed negligible (which is a fundamental assumption in the
long-channel approximation and should be valid to a depth on the order of the drain
junction depth), then $I_D$ can be found by summing all the current from the surface
down to some depth $x_i$ below which the additional contribution is negligible:

$$ I_D = q D_n Z \frac{d\xi}{dy} \int_0^{x_i} N(x)\,dx $$

This can be transformed from physical space in the y direction to potential space as
follows:

$$ \int_0^{L} I_D\,dy = q D_n Z \int_0^{U_D} d\xi \int_0^{x_i} N(x)\,dx \tag{2.3} $$


where  UD=qVDS/(kT)  is  the  normalized  drain  voltage  at  y=L  and  the  lower  limit  0  is  the 
grounded  source  voltage  at  y=0.  A  similar  transform  in  the  x  direction  yields: 


$$ I_D = q D_n \frac{Z}{L} \int_0^{U_D} d\xi \int_{U_F}^{U_S} \frac{N(U)}{(-dU/dx)}\,dU \tag{2.4} $$

where (dU/dx) is the x-component of the electric field, which can easily be derived by
integrating Poisson's equation by quadrature and is given below. The Boltzmann
approximation to the carrier concentration is used and the impurities are assumed
completely ionized, but the Fermi-Dirac and deionized forms can be used instead. Us is the
normalized surface potential (where the surface is at x=0), the total amount of surface band
bending relative to the intrinsic Fermi level. It is a function of both the gate voltage and
the drain voltage. UF is the normalized bulk Fermi level, below which the current
contribution is assumed negligible, and is analogous to the physical point $x = x_i$ in
(2.3). The derivative (dU/dx) is found from

$$ \left(-\frac{dU}{dx}\right) = \frac{F(U,\xi,U_F)}{L_D} \tag{2.5} $$

where

$$ F(U,\xi,U_F) = \left[\exp(U-\xi-U_F) + \exp(U_F-U) + (U-1)\exp(U_F) - \left(U+\exp(-\xi)\right)\exp(-U_F)\right]^{1/2} \tag{2.6} $$

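After substituting (2.5), the Boltzmann electron concentration $N(U) = n_i \exp(U-\xi-U_F)$
(with $n_i$ the intrinsic carrier concentration), and Einstein's relationship, $D_n/\mu_n = kT/q$, (2.4) becomes

$$ I_D = \mu_n\,kT\,\frac{Z}{L}\,n_i L_D \int_0^{U_D}\!\int_{U_F}^{U_S} \frac{\exp(U-\xi-U_F)}{F(U,\xi,U_F)}\,dU\,d\xi \tag{2.7} $$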

The surface potential, $U_S(\xi)$, is needed in (2.7). The relationship between the surface
potential and the gate voltage can be found by applying Gauss's Law at the
semiconductor/insulator interface. The resulting equation, given below, can be solved
iteratively for $U_S$ for a given $\xi$.

$$ U_G = U_S + \mathrm{sign}(U_S)\,\gamma\,F(U_S,\xi,U_F) \tag{2.8} $$

where $U_G$ is the normalized gate voltage, $q(V_{GS} - V_{FB})/kT$; $\gamma$ is $\epsilon_s/(L_D C_o)$; $L_D$ is the Debye
length ($\sqrt{\epsilon_s kT/(2 n_i)}/q$); and $F(U_S,\xi,U_F)$ is given by (2.6).

Equation  2.7  is  the  traditional  form  of  the  PS  integral,  often  called  the  Pao-Sah 
double  integral.  A  more  computationally  friendly  and  accurate  single-integral  form  [17] 
was  used  for  the  calculations  in  this  dissertation.  The  mobility  in  (2.7)  need  not  be  taken 
out  of  the  integrals.  Instead,  it  can  be  a  function  of  the  vertical  and  lateral  fields  and  moved 
inside  of  the  integrals.  In  this  chapter  the  mobility  will  be  assumed  independent  of  field. 
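
As a rough numerical illustration of (2.7) and (2.8), the following Python sketch evaluates the double integral directly. It is not the single-integral form [17] used for the calculations in this dissertation. The bisection solver, the Simpson quadrature, the values of N_I and MU_N, the Boltzmann form N = n_i exp(U - ξ - UF), and the choice to treat the inner integral as zero when Us <= UF are all assumptions made for the example. The gate-voltage argument is VGS - VFB, and the oxide thickness, temperature, doping, and Z/L are the values used for the model comparison later in this chapter.

```python
import math

# Physical constants (lengths in cm)
Q = 1.602e-19                 # electron charge, C
K_B = 1.381e-23               # Boltzmann constant, J/K
EPS_SI = 11.7 * 8.854e-14     # silicon permittivity, F/cm
EPS_OX = 3.9 * 8.854e-14      # SiO2 permittivity, F/cm

# Device parameters; N_I and MU_N are illustrative assumptions
T = 296.0                     # K
TOX = 500e-8                  # oxide thickness, cm (500 angstroms)
P_XX = 1e15                   # substrate doping, cm^-3
Z_OVER_L = 10.0
N_I = 1.0e10                  # intrinsic concentration, cm^-3 (assumed)
MU_N = 600.0                  # field-independent mobility, cm^2/V/s (assumed)

VT = K_B * T / Q                                  # kT/q, V
U_F = math.log(P_XX / N_I)                        # normalized bulk Fermi level
L_D = math.sqrt(EPS_SI * VT / (2.0 * Q * N_I))    # Debye length, cm
C_O = EPS_OX / TOX                                # oxide capacitance, F/cm^2
GAMMA = EPS_SI / (L_D * C_O)


def f_field(u, xi):
    """Normalized field F(U, xi, U_F) of Eq. (2.6)."""
    f2 = (math.exp(u - xi - U_F) + math.exp(U_F - u)
          + (u - 1.0) * math.exp(U_F)
          - (u + math.exp(-xi)) * math.exp(-U_F))
    return math.sqrt(max(f2, 0.0))   # clamp rounding noise near U = 0


def surface_potential(u_g, xi):
    """Solve Eq. (2.8) for U_S by bisection (assumes U_G > 0)."""
    lo, hi = 0.0, u_g
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid + GAMMA * f_field(mid, xi) < u_g:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simpson(func, a, b, n):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def inner_integral(u_g, xi, n=200):
    """Inner integral of Eq. (2.7) over U from U_F to U_S(xi)."""
    u_s = surface_potential(u_g, xi)
    if u_s <= U_F:                    # no inversion layer: treated as zero
        return 0.0
    return simpson(lambda u: math.exp(u - xi - U_F) / f_field(u, xi),
                   U_F, u_s, n)


def pao_sah_id(v_g, v_ds, n=120):
    """Drain current (A) from Eq. (2.7); v_g is VGS - VFB in volts."""
    u_g, u_d = v_g / VT, v_ds / VT
    double = simpson(lambda xi: inner_integral(u_g, xi), 0.0, u_d, n)
    return MU_N * K_B * T * Z_OVER_L * N_I * L_D * double
```

For example, `pao_sah_id(5.0, 0.1)` evaluates the linear-region current at VGS - VFB = 5 V and VDS = 0.1 V. Very large gate voltages would overflow `math.exp` in this simple form, which is one reason a rescaled or single-integral formulation is preferable in practice.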
A good way to understand Eq. 2.7 is to consider the three-dimensional band
structure  of  a  MOST  under  gate  and  drain  bias,  as  shown  in  Figures  2.1-2.4,  based  on  the 
original  Pao-Sah  paper  [16].  Figure  2.1  shows  an  idealized  n-channel  MOST.  Figure  2.2 
shows  the  corresponding  energy  band  diagram  with  no  applied  terminal  voltages  except 
VGS=VFB.  From  the  position  of  the  Fermi  level  it  is  easily  verified  that  the  source  and  drain 
are  n-type  and  the  substrate  is  p-type  (n-channel  device).  Electrons  in  the  source  and  drain 
see  a  potential  barrier  toward  the  channel. 

Application  of  a  positive  voltage  to  the  gate  lowers  the  barrier  near  the  surface, 
as  shown  in  Figure  2.3.  The  applied  gate  voltage  pulls  electrons  toward  the  surface  (and 
pushes  holes  away  from  the  surface),  as  can  be  seen  from  the  position  of  the  Fermi-level 
relative  to  the  band  edges.  Farther  into  the  substrate  (away  from  the  gate/substrate 
interface)  there  is  no  bending  from  the  gate  potential,  so  the  region  is  identical  to  the 
unbiased  case  (Figure  2.2)  and  considered  quasi-neutral. 

Applying  a  voltage  to  the  drain  (VDS  <  VDSsat)  splits  the  Fermi  level  into  quasi- 
Fermi  levels  (FN  for  electrons  and  Fp  for  holes),  as  shown  in  Figure  2.4.  One  can  imagine 
an  electron  in  the  conduction  band  surmounting  the  source  barrier  and  then  falling  down  the 
potential 'cliff' until reaching the drain. This 'free fall' is where the electron gains energy
while moving across the channel. If the electron is not scattered while moving across the
channel (losing energy to the lattice via phonons), it becomes increasingly energetic as it
approaches the drain and may become 'hot' enough to produce an e-h pair via impact ionization. The
resulting hole may generate interface traps via dehydrogenation of Si-H bonds near the
Si/SiO2 interface [19]. This is only one of several mechanisms for interface trap generation.
Fig. 2.1 Simplified view of two-dimensional MOS device.

Fig. 2.2 Schematic 2-D energy band diagram of simple MOS device with source and drain grounded and VGS=VFB. Adapted from Pao and Sah [16].

Fig. 2.3 Schematic 2-D energy band diagram of simple MOS device with VGS > VFB, drain and source grounded. Adapted from Pao and Sah [16].

Fig. 2.4 Schematic 2-D energy band diagram of simple MOS device with VGS > VFB, 0 < VDS < VDSsat, and source grounded. Adapted from Pao and Sah [16].
Figure 2.5 shows the result of applying a drain voltage in excess of VDSsat. As
will be discussed in the two-section model section later, the drain depletion region becomes
increasingly longer as the reverse-biased drain voltage increases. For this long-channel
section of the dissertation, however, the change in length, ΔL, is assumed much less than the
channel  length  L.  The  voltage  drop  across  this  thin  depletion  region  often  results  in  large 
fields  which  can  greatly  accelerate  carriers,  causing  the  interface  damage  mentioned  above. 

Now  that  the  effect  of  applied  biases  on  the  2-D  structure  of  the  band  has  been 
discussed,  it  is  easy  to  see  the  basis  of  the  integral  limits  in  Equation  2.7.  The  inner  integral 
is  integrating  from  the  surface  into  the  bulk  (from  Us  to  UF),  which  is  a  cross  section  of  the 
channel  as  shown  in  Figure  2.6.  The  outer  integral  is  integrating  from  drain  to  the  source 
(UD  to  0,  source  is  grounded)  along  the  channel.  Thus,  the  double  integral  is  summing  up 
all  the  current  contribution  in  the  channel,  exactly  as  would  be  expected.  Since  Us  is  a 
function  of  the  drain  voltage  (or  the  channel  potential),  the  order  of  the  double  integration  is 
not  trivially  reversible. 

Bulk-Charge  Model 

The first group of ID models, in order of complexity, were by Sah [13], Ihantola
and  Moll  [14],  and  Sah  and  Pao  [20].  These  are  all  bulk  charge  models,  taking  increasingly 
more  into  account.  As  the  name  suggests,  the  bulk  charge  model  takes  the  depleted  region 
under  the  channel  (in  the  bulk)  into  account.  It  assumes  drift  is  the  major  component  and  so 
neglects  the  diffusion  component.  This  greatly  simplifies  the  problem  and  reduces  (2.2)  to 
$$ J(x,y) = J_N + J_P \approx J_N = q\mu_n N E_y = q\mu_n N(x)\,\frac{dV}{dy} \tag{2.9} $$
Fig. 2.5 Schematic 2-D energy band diagram of simple MOS device with VGS > 0, VDS > VDSsat, and source grounded. Adapted from Pao and Sah [16].

[Figure 2.6 panels: E versus X near source; E versus X near drain; annotation qV(y) = FP - FN = ξ(y)kT]

Fig. 2.6 Schematic 2-D energy band diagram of simple MOS device with VGS > 0 and VDS < VDSsat. Cross-sections show the 1-D energy-band diagrams near the source and drain. G=Gate electrode, X=Substrate electrode.
$$ I_D = q\mu_n Z \frac{dV}{dy} \int_0^{x_i} N(x)\,dx \tag{2.10} $$

$$ I_D = -\mu_n Z \frac{dV}{dy}\,Q_N \tag{2.11} $$


where 

$$ Q_N = -C_o (V_G - V - V_{S0}) + (2 q P_{XX} \epsilon_s)^{1/2} \left[V_{S0} + V\right]^{1/2} \tag{2.12} $$

$C_o$ is the oxide capacitance per unit area, $V_G$ is VGS - VFB, $V_{S0}$ is the surface potential at
the source, and V is the channel potential (=VDS at the unsaturated drain). $P_{XX}$ is the
substrate impurity concentration and $\epsilon_s$ is the dielectric constant of silicon. The first term is
the charge accumulated in the channel and the second term is the uncompensated charge in
the depletion region beneath the channel (i.e. bulk charge). Substituting (2.12) into (2.11)
and integrating along the channel gives:

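$$ I_D = \mu_n \frac{Z}{L} C_o \left\{ (V_G - V_{S0}) V_{DS} - \frac{V_{DS}^2}{2} - \frac{1}{C_o}\,\frac{2}{3}\,(2 q P_{XX} \epsilon_s)^{1/2} \left[ (V_{S0} + V_{DS})^{3/2} - (V_{S0})^{3/2} \right] \right\} \tag{2.13} $$
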
This  form  is  slightly  different  than  the  Sah-Pao  and  Ihantola-Moll  forms  because  it  is  not 
assumed  that  VS0=2VF,  where  VF  is  the  Fermi  voltage.  A  more  exact  form  [17]  is: 

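$$ I_D = \mu_n \frac{Z}{L} C_o \left\{ V_G (V_{SL} - V_{S0}) - \frac{1}{2}\left(V_{SL}^2 - V_{S0}^2\right) - \frac{2}{3}\,\frac{1}{C_o}\,(2 q P_{XX} \epsilon_s)^{1/2} \left[ (V_{SL})^{3/2} - (V_{S0})^{3/2} \right] \right\} \tag{2.14} $$
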
where VSL is the surface potential at the drain. This differs from (2.13) in that the surface
potential at the drain is calculated instead of assumed to be VSL=VS0 + VDS. When the
drain current approaches or exceeds saturation (VDS > VDSsat), VSL≠VS0 + VDS.
Additionally, in subthreshold, VSL is typically closer to VS0 than VS0 + VDS [21]. As will
be shown later, the bulk charge formula should never be used for subthreshold calculations
since it neglects diffusion, which is the primary subthreshold current contribution.
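
As a quick consistency check (a worked step added for illustration, not part of the original derivation), setting $V_{SL} = V_{S0} + V_{DS}$ in the first two terms of (2.14) gives

```latex
\begin{aligned}
V_G (V_{SL} - V_{S0}) - \tfrac{1}{2}\left(V_{SL}^2 - V_{S0}^2\right)
  &= V_G V_{DS} - \tfrac{1}{2}\left[(V_{S0} + V_{DS})^2 - V_{S0}^2\right] \\
  &= V_G V_{DS} - V_{S0} V_{DS} - \tfrac{1}{2} V_{DS}^2 \\
  &= (V_G - V_{S0})\,V_{DS} - \tfrac{1}{2} V_{DS}^2 ,
\end{aligned}
```

while the 3/2-power bracket becomes $(V_{S0} + V_{DS})^{3/2} - (V_{S0})^{3/2}$, so (2.14) collapses to (2.13) under that assumption.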

The bulk charge form, compared to PS, is considerably easier to calculate,
particularly when using (2.13) with VS0 = 2VF, but is invalid in subthreshold. Equation
2.13 is also invalid in saturation as written, but that can be fixed somewhat by calculating
the saturation voltage VDSsat and fixing the current for all drain voltages greater than VDSsat.
This will make the first derivative (drain conductance) discontinuous at VDS=VDSsat. These
saturation problems are avoided in (2.14), where VSL is calculated rather than assumed.
Iterative calculation of VSL is time consuming, however, particularly compared to
assuming a constant, or pinned, surface potential value.
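
The clamped form of (2.13) with the pinned value VS0 = 2VF can be sketched in a few lines of Python. The closed-form VDSsat below comes from setting QN of (2.12) to zero at the drain, which is one reasonable way to define the saturation voltage but is an assumption here, as are the values of N_I and MU_N. The remaining parameters follow the model comparison later in this chapter, and the gate-voltage argument is VGS - VFB.

```python
import math

Q = 1.602e-19                 # electron charge, C
K_B = 1.381e-23               # Boltzmann constant, J/K
EPS_SI = 11.7 * 8.854e-14     # silicon permittivity, F/cm
EPS_OX = 3.9 * 8.854e-14      # SiO2 permittivity, F/cm

T = 296.0                     # K
TOX = 500e-8                  # oxide thickness, cm
P_XX = 1e15                   # substrate doping, cm^-3
Z_OVER_L = 10.0
N_I = 1.0e10                  # assumed intrinsic concentration, cm^-3
MU_N = 600.0                  # assumed mobility, cm^2/V/s

VT = K_B * T / Q
V_F = VT * math.log(P_XX / N_I)       # Fermi voltage
V_S0 = 2.0 * V_F                      # pinned source surface potential
C_O = EPS_OX / TOX
K_BODY = math.sqrt(2.0 * Q * P_XX * EPS_SI) / C_O   # body factor, V^0.5


def v_dsat(v_g):
    """Drain voltage where Q_N of Eq. (2.12) reaches zero at the drain.

    With s = sqrt(V_S0 + V), Q_N = 0 gives s^2 + K_BODY*s - v_g = 0.
    """
    s = 0.5 * (-K_BODY + math.sqrt(K_BODY ** 2 + 4.0 * v_g))
    return max(s * s - V_S0, 0.0)


def bulk_charge_id(v_g, v_ds):
    """Eq. (2.13) with the current held constant beyond VDSsat."""
    v = min(v_ds, v_dsat(v_g))
    body = (2.0 / 3.0) * K_BODY * ((V_S0 + v) ** 1.5 - V_S0 ** 1.5)
    return MU_N * Z_OVER_L * C_O * ((v_g - V_S0) * v - 0.5 * v * v - body)
```

Because the clamp returns zero whenever `v_dsat` is zero, this sketch also makes the subthreshold limitation visible: below threshold the model predicts no current at all.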

Charge-Sheet  Model 

While most of the interest centered on super-threshold operation of the MOST,
some researchers became concerned with the lack of accurate modeling for subthreshold
operation. Barron [21] and Van Overstraeten et al. [22] developed subthreshold formulae
based  on  simplifications  of  the  Pao-Sah  integral,  with  results  applicable  only  to  the 
subthreshold  region. 

Six years later, Brews [23] made a critical approximation which would allow both
drift  and  diffusion  components  to  be  introduced  simultaneously  without  the  need  for  a 
double  (or  single)  integral.  When  he  proposed  his  "charge-sheet  model,"  he  introduced  the 
following  simplification: 

$$ I = q Z \mu_n N(y)\,\frac{d\xi}{dy}, \qquad \frac{d\xi}{dy} = \frac{d\phi_s}{dy} - \frac{1}{\beta}\,\frac{d\ln N}{dy} \tag{2.15} $$

where $\phi_s$ is the surface potential and $\beta = q/kT$. This approximation for $d\xi/dy$ was justified "based upon its success in producing 'correct' I-
V curves," although he added a footnote relating the formula to electrochemical potential.
This  wide-open  statement  resulted  in  several  subsequent  'proofs'  which  derived  the  same 
formula  [17,  24,  25].  Essentially,  though,  he  decoupled  the  drift  and  diffusion  components 
from  the  tight  interdependency  seen  in  the  Pao-Sah  form  to  the  simple  form  of  (2.15). 

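Through a derivation similar to that of the bulk-charge model, ID is given by

$$ I = \mu_n \frac{Z}{y} \Big\{ C_o \left(\tfrac{1}{\beta} + V_G\right)\left(V_S(y) - V_{S0}\right) - \tfrac{1}{2} C_o \left(V_S^2(y) - V_{S0}^2\right) - \tfrac{2}{3}\,\frac{(2 q P_{XX} \epsilon_s)^{1/2}}{\beta^{3/2}} \left[ (\beta V_S(y) - 1)^{3/2} - (\beta V_{S0} - 1)^{3/2} \right] + \frac{(2 q P_{XX} \epsilon_s)^{1/2}}{\beta^{3/2}} \left[ (\beta V_S(y) - 1)^{1/2} - (\beta V_{S0} - 1)^{1/2} \right] \Big\} \tag{2.16} $$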

Eq. 2.16 reduces to the bulk-charge form of Eq. 2.14 if VG, VS(y), VS0 ≫ 1/β and the square
root terms are negligible. Unlike bulk-charge, this formula is valid in subthreshold and does
not require a calculation of VDSsat (assuming VSL and VS0 are calculated iteratively). Like
bulk-charge, this is much easier to calculate than a double, or even single, integral.
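
A minimal Python sketch of the charge-sheet calculation is given below, under stated assumptions. The surface potentials at the source and drain are found by bisection on (2.8) with ξ = 0 and ξ = UD, and the current is written in the equivalent (VS - 1/β) form obtained by integrating the drift and diffusion terms of (2.15) from source to drain (y = L) with the depletion approximation for the bulk charge. N_I, MU_N, and the solver are illustrative choices rather than values from the source, and the gate-voltage argument is VGS - VFB.

```python
import math

Q = 1.602e-19                 # electron charge, C
K_B = 1.381e-23               # Boltzmann constant, J/K
EPS_SI = 11.7 * 8.854e-14     # silicon permittivity, F/cm
EPS_OX = 3.9 * 8.854e-14      # SiO2 permittivity, F/cm

T, TOX, P_XX, Z_OVER_L = 296.0, 500e-8, 1e15, 10.0
N_I = 1.0e10                  # assumed intrinsic concentration, cm^-3
MU_N = 600.0                  # assumed mobility, cm^2/V/s

VT = K_B * T / Q                                  # 1/beta, V
U_F = math.log(P_XX / N_I)
L_D = math.sqrt(EPS_SI * VT / (2.0 * Q * N_I))
C_O = EPS_OX / TOX
GAMMA = EPS_SI / (L_D * C_O)
K_DEP = math.sqrt(2.0 * Q * P_XX * EPS_SI)        # C cm^-2 V^-0.5


def f_field(u, xi):
    """Normalized field F(U, xi, U_F) of Eq. (2.6)."""
    f2 = (math.exp(u - xi - U_F) + math.exp(U_F - u)
          + (u - 1.0) * math.exp(U_F)
          - (u + math.exp(-xi)) * math.exp(-U_F))
    return math.sqrt(max(f2, 0.0))


def surface_potential(v_g, xi):
    """Surface potential in volts from Eq. (2.8), by bisection (v_g > 0)."""
    u_g = v_g / VT
    lo, hi = 0.0, u_g
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid + GAMMA * f_field(mid, xi) < u_g:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * VT


def charge_sheet_id(v_g, v_ds):
    """Charge-sheet drain current (A) with iteratively solved VS0 and VSL."""
    v_s0 = surface_potential(v_g, 0.0)          # source end, xi = 0
    v_sl = surface_potential(v_g, v_ds / VT)    # drain end, xi = U_D
    if v_s0 <= VT:                              # depletion form needs beta*VS > 1
        return 0.0
    a0, al = v_s0 - VT, v_sl - VT
    drift = (C_O * v_g * (v_sl - v_s0)
             - 0.5 * C_O * (v_sl ** 2 - v_s0 ** 2)
             - (2.0 / 3.0) * K_DEP * (al ** 1.5 - a0 ** 1.5))
    diffusion = VT * (C_O * (v_sl - v_s0)
                      + K_DEP * (math.sqrt(al) - math.sqrt(a0)))
    return MU_N * Z_OVER_L * (drift + diffusion)
```

No VDSsat is computed anywhere in this sketch: saturation emerges because the solved drain-end surface potential stops rising once the inversion charge there becomes negligible. The cost is two iterative surface-potential solutions per bias point.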

Brews,  and  many  subsequent  authors,  validated  the  charge-sheet  model  by 
comparing  it  to  the  results  of  the  Pao-Sah  formula.  It  has  been  shown  to  be  an  excellent 
approximation,  as  will  be  discussed  in  the  next  section. 

Comparison  of  Long-Channel  Models 

The  Pao-Sah  double-integral  model  has  been  heralded  as  the  best  long-channel 
model. Brews [23] went so far as to say that "Comparison of the charge-sheet model with the
Pao-Sah  model  has  the  force  of  comparison  with  experiment,  since  the  Pao-Sah  model  is 
known  to  work  well  for  long  channel  devices."  Schrimpf  et  al.  [26]  agreed,  saying  Pao  and 
Sah  "produced  a  quantitative  model  so  accurate  that  it  is  the  standard  by  which  other  models 
are judged." Since bulk-charge and charge-sheet are both approximations to Pao-Sah, it
makes  sense  to  compare  them  with  Pao-Sah  to  see  how  accurate  they  are,  taking  into 
account  that  all  the  models  are  only  valid  for  long-channel  devices. 

Figure 2.7 shows all three methods simulated for Tox=500 Å, T=296 K, Pxx=10^15
cm^-3, W/L=10. These are typical parameters for LSI devices of the 1970s, and were chosen
to match the data used in Pierret and Shields [17]. As can be seen, the bulk-charge and
charge-sheet  models  underestimate  the  current.  Figure  2.8  shows  the  percentage  error  for 
each  model  at  the  gate  voltages  shown  in  Fig.  2.7,  demonstrating  that  the  charge-sheet 
model  maintains  an  error  of  less  than  2.6%  for  all  gate  voltages,  while  the  bulk  charge 
model  ranges  from  2.5%  for  VGS=5.0V  to  8.4%  for  VGS=2.0  V.  This  suggests  that  the 
much  simpler  charge  sheet  can  be  used  in  place  of  Pao-Sah  incurring  only  about  2.5%  error 
at  low  voltages. 

Figure  2.9  shows  the  subthreshold  region  for  the  same  device  with  VDS=0.1  V. 
Clearly  demonstrated  in  this  figure  is  both  the  glaring  inadequacy  of  the  bulk-charge  model 
for  subthreshold  modeling  and  the  remarkable  accuracy  of  the  simple  charge-sheet  model. 
However,  recall  that  this  is  charge-sheet  with  iteratively  calculated  surface  potentials,  so  the 
numerical  solution  is  not  entirely  trivial. 

Two-Section  Models 

Up until now, only long-channel ID equations have been considered. For short-
channel devices (<1 µm), the most prominent non-modeled effect on the drain current is
finite drain conductance beyond saturation. The primary cause of this non-zero drain-
conductance (gD) is channel shortening from the drain space-charge region (SCR)
encroaching into the channel. This effect is often called channel-length modulation since
the drain voltage modulates the effective channel length.

Fig. 2.7 ID versus VDS for different VGS values for the three 1-D ID models. Parameters are Tox=500 Å, T=296 K, Pxx=10^15 cm^-3, W/L=10, which were used to match data in Pierret and Shields [17].

Fig. 2.8 Percentage error in ID for charge sheet and bulk charge relative to Pao-Sah versus VDS, from Fig. 2.7. Plots are VGS = 5, 4, 3, and 2 V, with higher errors for lower voltages.

Fig. 2.9 ID versus VGS for Pao-Sah, charge sheet, and bulk charge using same data as Fig. 2.7 with VDS=0.1 V. Clearly bulk charge is not useful in subthreshold, whereas charge-sheet is almost coincident with Pao-Sah.

The most logical approach is to divide the region between the source and drain into two sections: a 'source side' and a 'drain side'. The 'source side' may contain any appropriate long-channel IV model, such as Pao-Sah, charge sheet, or bulk charge. The 'drain side' is the depletion region, and can be modeled with or without mobile charge, 2-D effects, mobility differences, etc. The location of the boundary between these regions, and the voltages and fields at this boundary, are what make this a challenging problem. Figure 2.10 shows a diagram of a MOS transistor divided into two sections.

There are essentially three things which differ among approaches to two-section theory: the source-side IV model, the drain-side space-charge region (SCR) model, and the boundary conditions.

Fig. 2.10 Schematic diagram of two-section MOST for 1-D modeling. SCR means 'Space-Charge Region' and Leff refers to the effective channel length.

The IV model can be one of the many already discussed. The SCR model can be
assumed  fully  depleted,  take  mobile  charge  into  account,  or  be  a  complete  2-  or  3-D  model. 
The  boundary  conditions  are  the  most  difficult  and  varied  among  approaches.  Essentially, 
the  potentials,  fields,  and  charge  at  the  boundary  between  the  two  regions  need  to  be 
matched. 

The  simplest  two-section  MOST  model  was  introduced  in  1965  by  Reddi  and  Sah 
[27].  They  used  a  source-side  bulk-charge  model  for  the  current  and  a  fully-depleted  drain- 
side  depletion  model.  From  the  first  derivative  of  the  bulk-charge  model  (Eq.  2.13  with 
VS=2VF),  Reddi-Sah  (and  others)  calculated  the  drain  voltage  where,  for  a  constant  gate 
voltage,  the  drain  conductance  drops  to  zero  (VDSsat).  They  then  assumed  all  voltage  in 
excess  of  VDSsat  falls  across  the  SCR  to  form  the  drain  region  of  the  two-section  model. 

By assuming complete depletion (no mobile charge) and no y-field at the boundary, the length can be calculated from simple p/n junction theory as:

    \Delta L = [2\varepsilon_S (V_{DS} - V_{DSsat} + V_{bi}) / (q P_{XX})]^{1/2}    (2.17)

where Vbi = (kT/q) ln(Ndrain Nsubstrate/ni^2) from standard abrupt-junction p/n theory. Replacing L by Leff = L - ΔL and VS0 with 2VF in (2.13) yields the Reddi-Sah two-section current.
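
As a rough numerical illustration of (2.17), the sketch below evaluates the drain depletion length for a given excess drain voltage. The drain doping, the intrinsic carrier density, and the thermal voltage are assumed values chosen only for illustration; they are not taken from the text.

```python
import math

Q = 1.602e-19              # electron charge, C
EPS_SI = 11.7 * 8.854e-14  # silicon permittivity, F/cm
KT_Q = 0.02585             # thermal voltage near 300 K, V (assumed)
NI = 1.0e10                # intrinsic carrier density, cm^-3 (assumed)


def built_in_voltage(n_drain, p_sub):
    """Abrupt-junction built-in voltage Vbi = (kT/q) ln(Nd*Na/ni^2), in volts."""
    return KT_Q * math.log(n_drain * p_sub / NI**2)


def delta_l(vds, vdssat, p_sub, n_drain):
    """Depletion length from Eq. (2.17), in cm, for VDS at or above VDSsat."""
    if vds < vdssat:
        raise ValueError("Eq. (2.17) is only applied for VDS >= VDSsat")
    vbi = built_in_voltage(n_drain, p_sub)
    return math.sqrt(2.0 * EPS_SI * (vds - vdssat + vbi) / (Q * p_sub))


# Example: Pxx = 5e17 cm^-3, assumed drain doping 1e20 cm^-3, 1 V past saturation.
dl = delta_l(vds=3.0, vdssat=2.0, p_sub=5e17, n_drain=1e20)
print(f"Delta L = {dl * 1e4:.4f} um")
```

The effective length then follows as Leff = L - ΔL, which is the quantity substituted into the long-channel current expression.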

The simplicity of this formula is extremely attractive, but the solution is dependent on the ID model. Specifically, it assumes that a VDSsat voltage can be found. If using Pao-Sah or charge-sheet, the surface potential is not constant and a VDSsat point does not actually exist. Even if VDSsat is found from extrapolation, the first derivatives of the drain current will be non-smooth at the point where the drain current switches from one model (Pao-Sah, charge-sheet, bulk-charge) to another (constant ID), although this can be fixed with various smoothing transitional functions.

Three years after Reddi and Sah's paper, Chiu and Sah [28] came out with a two-section model which solved Laplace's equation in the oxide layer and matched values in four regions (source, drain, oxide, and bulk). The drain region was solved as a 2-D, fully-depleted region, and the solution required seven matching parameters. The complexity of the solution relegated this model to being cited almost exclusively as "too complex."

The following year (1969) Frohman-Bentchkowsky and Grove [29] developed a two-section model using a bulk-charge model in the source region and an empirical model for the drain section. This simple model essentially added two fringe-field contributions to the Reddi-Sah model, along with two empirical variables to fit the data.

Merckel,  Borel,  and  Cupcea  [30]  added  mobile  charge  to  the  drain  region 
empirically  by  writing  Poisson's  equation  in  the  drain  region  as 

    d^2V/dy^2 = (q/\varepsilon_S)[P_{XX} + I_{DS}/(q Z a)]    (2.18)

where  a  is  essentially  a  fitting  parameter  related  to  the  junction  depth.  This  mobile  charge 
is  akin  to  the  Kirk  effect  in  bipolar  devices,  just  as  the  drain-depletion  encroachment  is 
analogous  to  the  Early  effect.  Using  an  iteratively  determined  VDSsat,  they  were  able  to 
calculate  the  drain  depletion  width.  Popa  [31]  devised  a  similar  model  and  extended  the 
drain  depletion  region  to  be  of  three  types  depending  on  the  injected  current.  In  both 
mobile-charge cases, fitting parameters were introduced either through (2.18) or mobility. Both used variations of the simple bulk-charge model for the source side.

After Brews developed the charge-sheet model, all subsequent two-section models employed the charge-sheet model. Guebels and Van de Wiele [32] developed a three-section model to account for the x-field reversal near the drain. They employ the same trick as the previous papers by fitting the a in (2.18), using VDSsat (or IDsat) and adding some empiricism to their field calculations.

Beyond  Two-Section  Models 

The charge-sheet model (and Pao-Sah, as will be shown) does not lend itself well to analytical two-section models due to the greater complexity of the drain current model relative to bulk charge. As noted above, fitting parameters and empirical formulae had to be introduced to satisfy some of the boundary conditions.

The newer compact models, such as BSIM [33,34] and the Siemens model [35-37], are based loosely on one-section bulk-charge and charge-sheet models, respectively, sometimes dividing the model into different sections based on operation (separate subthreshold and superthreshold formulae). They both model short-channel effects by adding semi-empirical terms to the threshold voltage, which makes for a considerably faster calculation at the expense of a less physical model.

Examples  Using  Pao-Sah 

The goal was to develop a two-section model which employs the Pao-Sah integral as the source-side current formula. The following is a description of the methodology and results of the exercise.

Field-Matching Method

The Pao-Sah current has already been discussed, as have models for the depletion region. Let us consider the matching boundary of the two-section model to occur at the point y = yM where the channel voltage is VM with a lateral field EM and electric field gradient d2US/dy2 = dEM/dy.

A simple way to look at this problem is from Poisson's equation in the drain region while considering the boundary conditions. Within the drain region, which extends from y = yM to y = L, the boundary conditions are (see Fig. 2.10):

    V(L) = V_{DS}
    V(y_M) = V_M
    dV(y_M)/dy = E_M    (field at the match point)
    d^2V(y_M)/dy^2 = (1/\varepsilon_S)[q P_{XX} + (mobile charge terms)] = C

It is possible from Pao-Sah to calculate dV(yM)/dy = E_M^{PS} [38]. This gives us the following equation after integrating Poisson's equation twice with the above boundary conditions:

    (V_{DS} - V_M) = (C/2)(L - y_M)^2 - E_M (L - y_M)    (2.20)

This reduces all the boundary conditions to one equation with two unknowns (yM and VM). The ideal additional equation would be d2V(yM)/dy2 on the Pao-Sah side, but this quantity is incalculable from the Pao-Sah integral.

If it is assumed that EM = 0 (as was done in Reddi-Sah), the depletion length into the channel can be easily found. It is reasonable to assume that the lateral field at the matching point (EM) is much less than the field right at the drain (ED), so ED >> EM, making the difference in yM small. This gives (from 2.20, also 2.17)

    y_M = L - [2(V_{DS} - V_M + V_{bi})/C]^{1/2}    (2.21)

where Vbi accounts for the pre-existing depletion region originating from the abrupt p/n junction. Since the yM approximation has already been made, it will be assumed that the field throughout the drain region is constant, so that the field at the boundary is given by

    E_M^{dep} = (V_{DS} - V_M)/(L - y_M)    (2.22)

Clearly there are conflicting assumptions (EM = 0, and now EM ≠ 0). One might wonder why EM is not (VDS - VM + Vbi)/(L - yM) to be consistent with (2.21). This comes from the subtlety of the boundary conditions. Looking back to Figure 2.2, note that the integration is actually from VS + Vbi to VDS + Vbi, which excludes the p/n depletion layers. The Vbi's cancel out for symmetrical devices, so this is no problem. At VDS = 0 (and source grounded), no current or field is expected, which would make VM correctly equal to 0 in (2.22). However, if Vbi were added to (2.22), then VM would have to equal Vbi, which would incorrectly cause a field (and possibly current flow depending on VGS). Essentially, (2.22) gives the excess field. However, Vbi does contribute to the depletion width, so it is included in (2.21).
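
The distinction between (2.21) and (2.22) can be checked numerically. The following is a minimal sketch, assuming mobile charge is neglected so that C = qPxx/εS; the channel length, doping, and Vbi values are illustrative assumptions rather than values from the text.

```python
import math

Q = 1.602e-19              # electron charge, C
EPS_SI = 11.7 * 8.854e-14  # silicon permittivity, F/cm


def match_point(vds, vm, vbi, l_cm, p_sub):
    """Return (yM, EM_dep) from Eqs. (2.21) and (2.22).

    Lengths are in cm and the field is in V/cm. Mobile charge is
    neglected, so C = q*Pxx/eps_S (an assumption of this sketch).
    """
    c = Q * p_sub / EPS_SI                         # V/cm^2
    depletion = math.sqrt(2.0 * (vds - vm + vbi) / c)
    y_m = l_cm - depletion                         # Eq. (2.21), includes Vbi
    e_m = (vds - vm) / (l_cm - y_m)                # Eq. (2.22), excess field only
    return y_m, e_m


# With VDS = VM = 0 the junction still has a depletion width (from Vbi),
# but the excess field is zero, as argued in the text.
y0, e0 = match_point(vds=0.0, vm=0.0, vbi=1.05, l_cm=0.5e-4, p_sub=5e17)
y1, e1 = match_point(vds=3.0, vm=2.0, vbi=1.05, l_cm=0.5e-4, p_sub=5e17)
print(f"VDS=0: yM = {y0 * 1e4:.4f} um, EM = {e0:.3e} V/cm")
print(f"VDS=3: yM = {y1 * 1e4:.4f} um, EM = {e1:.3e} V/cm")
```

In the full field-matching method VM is not an input; it is iterated until this depletion-side field equals the field computed on the Pao-Sah side.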

The  normalized  field  on  the  Pao-Sah  side  at  the  boundary  is  given  by  [32] 

[The displayed equation is garbled in the source scan. The recoverable terms are exponentials in US, UM, and UF, a function F(·, UM, UF), and a double integral in dU dξ taken between UM and US.]

where UM is the normalized matching voltage, VM(q/kT).

Figure 2.11 shows the results of this approach, with mobile charge terms neglected (C = qPxx/εSi) for Pxx = 5×10^17 cm^-3, T = 300 K, and Tox = 50 Å. The data cover a wide range of channel lengths from 1/4 µm to ∞, and for all cases the width is equal to the length (square devices). The saturation current predicted by long-channel theory for these square devices would be the same for all channel lengths, so the deviation from this is due to channel-length modulation, which clearly becomes more important as the channel length decreases. Figure 2.12 shows that the drain conductance (gD = dID/dVDS) is smooth, which is important for circuit simulator applications. Although not shown, the derivative of the drain conductance is also smooth. Thus, this field-matching model successfully extends the 1-D Pao-Sah model to short channels, at least with regard to including the effective channel-shortening effect.

Saturation-Voltage  Method 

Reddi and Sah [27] assumed VM = VDSsat, which simplified things considerably. VDSsat is easy to calculate when using the bulk-charge formula assuming VS0 = 2VF since the derivative of the surface potential with respect to the drain voltage is zero. The Pao-Sah current, however, does not technically saturate (numerically there will be a point where the current does not increase, but it will be at a drain voltage well in excess of the normal VDSsat point). This problem is solved by extrapolating VDSsat from dID/dVDS versus VDS without channel shortening. Figures 2.13 and 2.14 show the results of employing this method with the same device as used in the previous section (Pxx = 5×10^17 cm^-3, T = 300 K, Tox = 50 Å), using Eq. 2.21 for yM with VM = VDSsat. Clearly the channel-length modulation is being accounted for, but the transition is slightly abrupt. A look at the resulting drain conductance (Fig. 2.14) shows a drastic discontinuity near the calculated VDSsat point. Use of a fitting function could rectify this derivative problem, and is a common practice for compact models.
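
The extrapolation step can be sketched as follows. This is only an illustration of the idea: a hypothetical square-law current stands in for the Pao-Sah integral, and the threshold voltage, gain factor, and fitting window are assumed values. The drain conductance is computed numerically, fitted with a straight line over its quasi-linear portion, and the zero crossing of that line is taken as VDSsat.

```python
def drain_current(vds, vgs, vt=0.7, k=1e-4):
    """Hypothetical square-law stand-in for a long-channel ID model (A).

    Only meaningful below pinch-off; it is not the Pao-Sah integral.
    """
    vov = vgs - vt
    return k * (vov * vds - 0.5 * vds * vds)


def conductance(vds, vgs, dv=1e-4):
    """Central-difference estimate of gD = dID/dVDS."""
    return (drain_current(vds + dv, vgs) - drain_current(vds - dv, vgs)) / (2.0 * dv)


def extrapolate_vdssat(vgs, v_lo=0.5, v_hi=1.5, n=11):
    """Least-squares line through gD(VDS) on [v_lo, v_hi]; return its zero crossing."""
    xs = [v_lo + i * (v_hi - v_lo) / (n - 1) for i in range(n)]
    ys = [conductance(x, vgs) for x in xs]
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return -intercept / slope


print(f"Extrapolated VDSsat = {extrapolate_vdssat(5.0):.3f} V")
```

For the square-law stand-in the conductance is exactly linear, so the extrapolated value coincides with VGS - VT. With Pao-Sah the conductance curve bends near saturation, which is why the choice of fitting window matters there.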

Surface-Potential  Self-Saturation  Method 

Another possible way to circumvent finding the VM point was posed by Katto and Itoh [39]. Instead of finding VDSsat, they used the fact that the surface potential itself will saturate when solved iteratively from (2.8). Thus replacing the matching voltage, VM, with the surface potential at the drain, VSL (solved iteratively), gives another decoupled way to solve for yM. Using the surface potential to find the depletion thickness was also done by Sah [2]. This is better than the VDSsat method since there is no single abrupt point where saturation occurs. However, as shown in Figs. 2.15 and 2.16, the current still has a slight 'jump' resulting in discontinuities in gD.

In  Search  of  the  Match  Point 

Sah [2] showed pictorially that in saturation, the energy band near the drain edge will actually be bent upward, or in other words, the surface will be accumulated rather than inverted (actually, the surface will still be depleted, but now accumulation refers only to the shape of the band bending). This must be the case since the potential along the channel is actually higher than VGS - VGT = VDSsat. This means that there must be a point along the channel at which the band bending is zero at the surface, and this point would be an excellent candidate for the yM point. Like the methods above, however, this point has some logical flaws. For instance, the field in the x-direction is zero by definition, which means that using the channel potential at this point to calculate the current from the long-channel model will clearly invalidate the gradual channel approximation (Ex >> Ey), a basis for the Pao-Sah ID derivation.

Fig. 2.11 ID versus VDS plots for different channel lengths (square devices) using field matching at the match point. VGS = 5 V, Pxx = 5×10^17 cm^-3, T = 300 K, Tox = 50 Å.

Fig. 2.12 gD versus VDS plots for different channel lengths (square devices) using field matching at the match point. Same parameters as Fig. 2.11.

Fig. 2.13 ID versus VDS plots for different channel lengths (square devices) using VM = iterative surface potential at drain. VGS = 5 V, Pxx = 5×10^17 cm^-3, T = 300 K, Tox = 50 Å.

Fig. 2.14 gD versus VDS plots for different channel lengths (square devices) using VM = iterative surface potential at drain. Same parameters as Fig. 2.13.

Fig. 2.15 ID versus VDS plots for different channel lengths (square devices) using VM = VDSsat. VGS = 5 V, Pxx = 5×10^17 cm^-3, T = 300 K, Tox = 50 Å.

Fig. 2.16 gD versus VDS plots for different channel lengths (square devices) using VM = VDSsat. Same parameters as Fig. 2.15.

A simple approximation for this point would be to use VM = VGS - VFB when VDS > VGS - VFB, which is akin to setting VDSsat = VGS - VFB. This ends up resulting in the same sort of problem seen in the VDSsat method.

It is interesting to verify the existence of this turn-around region in the channel near the drain, however. This was done recently using the MINIMOS device simulator [40]. MINIMOS was modified to use a constant mobility model so as to be comparable to the 1-D model cases above. Figure 2.17 shows the resulting electrostatic potentials into the substrate at different points near the drain edge of a 50 Å, 100×100 µm (corrected for subdiffusion) nMOST with Pxx = 5×10^17 cm^-3 at VGS = 1.5 V and VDS = 3.0 V. VFB was fixed at zero for this case. What is clear is that the band moves from inversion (top) through flatband into accumulation (bottom) at the surface (x = 0.0). The flatband point occurs when the channel potential is equal to VGS - VFB = 1.5 V, as expected.

Summary 

This chapter reviewed the history of 1-D long-channel drain-current models and discussed the pros and cons of their derivation and applications. From this, the importance of a non-pinned surface potential was shown, as demonstrated by the excellent approximation of the simple charge-sheet model to the Pao-Sah double integral, the best of the 1-D long-channel models.

Next, methods to extend the 1-D model into two 1-D sections, so as to create the best full-range 1-D model, were examined. It was discovered that, no matter what, the depletion region is strictly 2-D, and obtaining a 1-D approximation requires rather substantial assumptions. One model, the field-matching approach, was seen to give reasonably good characteristics, while all the other approximations (VDSsat and surface-potential self-saturation) resulted in discontinuities in the first (and higher) derivatives.

A 2-D simulation was used to verify that there is a point in the saturated channel where the (x-directed) field reverses and the surface band bending is, thus, zero. This point has been suggested many times before in our group, but never verified two-dimensionally. Attempting to use this point to demarcate the boundary of the source region and drain region of the two-section model gives the same poor results as the VDSsat method.

Fig. 2.17 Electrostatic potentials into the substrate at different points near the drain edge of a 50 Å, 100×100 µm (corrected for subdiffusion) nMOST with Pxx = 5×10^17 cm^-3 at VGS = 1.5 V and VDS = 3.0 V. The Y = 1.321 µm (near source) and Y = 50.000 µm (middle of the channel) curves are indistinguishable. The band is flat at the SiO2/Si surface when the channel electrostatic potential equals VGS - VFB = 1.5 V (VFB = 0 for this data).


CHAPTER 3
POLYSILICON-GATE MOS LOW-FREQUENCY
CAPACITANCE-VOLTAGE CHARACTERISTICS


Introduction 

For  modern  ULSI  technology,  polysilicon  gates  are  universally  used  on  MOS 
devices.  With  respect  to  MOS  device  characteristics,  there  is  no  advantage  to  substituting 
metal  gates  with  heavily-doped  polysilicon  (poly)  gates.  In  fact,  poly  gates,  as  will  be 
shown  in  this  chapter,  greatly  reduce  the  effectiveness  of  thinning  the  oxide  layer  to 
increase  the  drain  current.  The  use  of  poly  gates  is  a  question  of  cost  as  well  as 
performance,  however,  and  poly  gates  have  some  tremendous  processing  and  density 
benefits  over  metal  gates.  Polysilicon  gates  can  withstand  high  temperature  steps  that 
would  cause  most  deposited  metal  gates  to  evaporate,  particularly  the  source/drain  drive- 
in  step.  Polysilicon  gates  also  allow  for  self-alignment  of  the  gate  over  the  oxide  between 
the  source  and  drain,  removing  what  would  be  the  most  difficult  (and  costly)  alignment 
step  in  the  process  flow  [41-42]. 

This chapter covers the derivation of a Fermi-Dirac-based polysilicon-gate MOS low-frequency capacitance-voltage model. This model will be used to illustrate the effects of polysilicon gates on MOS low-frequency (LF) capacitance-voltage (CV) characteristics compared to metal-gate LFCV characteristics. A useful application for the model is physical parameter extraction, which is demonstrated in this chapter using two different methodologies: 3-point fit and 3-region fit. Sample parameter extractions for thick (130 Å) and thin (20 Å) gate oxides are shown, and limitations of the model are discussed. Quantum effects are purposely ignored, and the reasoning behind this decision is discussed. Important details related to fast convergence of the parameter-extraction routines are also given.

Metal-Gate  CV 

Ideal metal-gate CV theory using Boltzmann statistics has been extensively discussed [3,43], as has the extension to include the Fermi-Dirac carrier distribution and deionization effects [44-46]. The appendix contains the full metal-gate LFCV model
derivation,  taking  Fermi-Dirac  statistics  and  deionization  into  account.  The  relevant 
solutions  are  given  below. 

Fig. 3.1 MOS capacitor schematic and corresponding energy-band diagram. (A) Schematic diagram of a MOS capacitor and (B) corresponding energy-band diagram depicting the potential drops. Shown is a positive voltage VG applied at the gate, resulting in the SiO2/Si surface entering inversion.

Figure 3.1 shows a schematic diagram of an ideal metal-gate MOS device and the corresponding band diagram. From Figure 3.1(b), as explained in the appendix, Kirchhoff's voltage law around the loop gives:

    \Phi_M + V_O = \chi_S - V_{IX} + (E_C - E_I)/q + V_F + V_G    (3.1)

where ΦM is the work function of the metal, VO is the potential drop across the oxide, χS is the electron affinity of the substrate, EC and EI are the conduction-band edge and intrinsic energies, respectively, in the substrate, and VF is the Fermi voltage, which is equivalent to (EI - FP)/q for p-type material, where FP is the quasi-Fermi level for holes and q is the electron charge. Collecting these terms in cleaner form gives

    V_G = \phi_{MS} + V_{IX} + V_O    (3.2)

where φMS = ΦM - (χS + (EC - EI)/q + VF) is the work-function difference between the metal and the semiconductor. As will be shown in the next section, the work-function difference for a polysilicon-gate MOS device is much simpler than for metal-gate MOS since the substrate and gate materials are the same. The drop across the oxide can be found from Gauss's Law requirements as

    V_O = \varepsilon_S E_{IX}/C_O - (Q_{OT} + Q_{IT})/C_O,    (3.3)

where εS is the dielectric constant of the semiconductor (≈11.7 × 8.85×10^-14 F/cm for Si), EIX is the field at the substrate surface, CO is the oxide capacitance, and QOT and QIT represent fixed and interface-trapped oxide charge, respectively. With this relation, (3.2) can be rewritten as

    V_G = V_{FB} + V_{IX} + \varepsilon_S E_{IX}/C_O,    (3.4)

where VFB, the flat-band voltage, is given by

    V_{FB} = \phi_{MS} - (Q_{OT} + Q_{IT})/C_O.    (3.5)

For metal gates, there is no capacitive contribution from the metal, so the gate capacitance is simply the series equivalent of the fixed oxide capacitance, CO, and the variable substrate capacitance, Cix:

    C_g = C_{ix} C_O/(C_{ix} + C_O).    (3.6)

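The field going into the substrate, EIX, and the substrate capacitance, Cix, are given by

    E_{IX}^2 = (2kT/\varepsilon_S) \{ N_V[\mathcal{F}_{3/2}(-U_{IX} - U_V + U_F) - \mathcal{F}_{3/2}(-U_V + U_F)]
               + N_C[\mathcal{F}_{3/2}(U_{IX} + U_C - U_F) - \mathcal{F}_{3/2}(U_C - U_F)]
               + P_{XX}( U_{IX} + \ln\{[1 + g_A\exp(U_F - U_A - U_{IX})]/[1 + g_A\exp(U_F - U_A)]\} )
               + N_{XX}( -U_{IX} + \ln\{[1 + g_D\exp(U_D - U_F + U_{IX})]/[1 + g_D\exp(U_D - U_F)]\} ) \}    (3.7)

    C_{ix} = (q/E_{IX}) \{ -N_V\mathcal{F}_{1/2}(-U_{IX} - U_V + U_F) + N_C\mathcal{F}_{1/2}(U_{IX} + U_C - U_F)
             + P_{XX}/[1 + g_A\exp(U_F - U_A - U_{IX})] - N_{XX}/[1 + g_D\exp(U_D - U_F + U_{IX})] \}    (3.8)
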
where all 'U' values are potentials normalized to kT/q and referenced to the intrinsic Fermi level. For example, UF is the normalized Fermi level, qVF/(kT). Pxx is the acceptor substrate doping concentration and Nxx is the donor substrate doping concentration, and gA and gD are the corresponding degeneracy factors for the trap levels UA (acceptor energy level) and UD (donor energy level, not to be confused with the normalized drain voltage of an MOS transistor). NV is the valence-band density of states and NC is the conduction-band density of states.

Those familiar with MOS capacitance equations might find these far more complex than they recall; a perusal of the appendix should clear up any questions about this form. However, it is instructive to show how this reduces to a more familiar Boltzmann form. First, all of the Fermi-Dirac integrals [F1/2(η) and F3/2(η) terms] reduce to exponentials in the Boltzmann range of applied gate voltages (η < -4). Second, there is typically only one dominant dopant, so one of the last two terms in (3.7) and (3.8) can be neglected (the first can be neglected for n-type substrate, and the second for p-type substrate). Furthermore, if deionization is neglected (UF - UA - UIX < -3 for p-type or UD - UF + UIX < -3 for n-type), then the last two terms of (3.7) reduce to Pxx·UIX - Nxx·UIX. Likewise, the two lines of (3.8) reduce to Pxx - Nxx when deionization is neglected.
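
The Boltzmann limit of the Fermi-Dirac integral can be checked numerically. The sketch below assumes the common normalization in which F1/2(η) tends to exp(η) for η << 0; the normalization used in the appendix may differ by a constant factor. The integral is evaluated with Simpson's rule after the substitution x = t^2, which removes the square-root singularity at the origin.

```python
import math


def fermi_half(eta, n=4000):
    """Fermi-Dirac integral of order 1/2, normalized so F(eta) -> exp(eta) for eta << 0.

    F(eta) = (2/sqrt(pi)) * integral_0^inf sqrt(x) / (1 + exp(x - eta)) dx.
    With x = t^2 the integrand becomes 2 t^2 / (1 + exp(t^2 - eta)).
    n must be even for Simpson's rule.
    """
    t_max = math.sqrt(max(eta, 0.0) + 60.0)  # integrand is negligible beyond this
    h = t_max / n

    def integrand(t):
        return 2.0 * t * t / (1.0 + math.exp(t * t - eta))

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (2.0 / math.sqrt(math.pi)) * total * h / 3.0


for eta in (-6.0, -4.0, -2.0, 0.0, 2.0):
    ratio = fermi_half(eta) / math.exp(eta)
    print(f"eta = {eta:5.1f}   F_1/2 / exp(eta) = {ratio:.4f}")
```

At η = -4 the ratio is within one percent of unity, which is consistent with taking η < -4 as the Boltzmann range; for η near or above zero the exponential overestimates the carrier density substantially.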

As an example of the simplified form, let us consider a p-type substrate in strong accumulation. In this case, it can be assumed that only the accumulated surface carrier term is dominant (UIX is large and negative). Noting also that, in the Boltzmann case, UF - UV = ln(Pxx/NV), (3.7) and (3.8) would reduce to

    |E_{IX}| = (2kT P_{XX}/\varepsilon_S)^{1/2} \exp(-U_{IX}/2)

    C_{ix} = q[P_{XX}\varepsilon_S/(2kT)]^{1/2} \exp(-U_{IX}/2)

These are the more tractable strong-accumulation forms found in undergraduate textbooks [3, 43] and which form the basis for one well-known oxide-thickness extrapolation algorithm [47].
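
A short numerical sketch of the strong-accumulation form, combined with the series relation (3.6), is given below. It is written in terms of the magnitude of the normalized surface band bending so that the exponential grows as accumulation deepens. The doping, the band-bending value, and the SiO2 permittivity are assumptions for illustration, and Boltzmann statistics are assumed throughout even where they would be marginal in practice.

```python
import math

Q = 1.602e-19              # electron charge, C
K_B = 1.380649e-23         # Boltzmann constant, J/K
EPS_SI = 11.7 * 8.854e-14  # silicon permittivity, F/cm
EPS_OX = 3.9 * 8.854e-14   # SiO2 permittivity, F/cm (assumed)


def c_accumulation(u_mag, p_sub, temp=300.0):
    """Boltzmann strong-accumulation substrate capacitance, F/cm^2.

    u_mag is |U_IX|, the magnitude of the normalized surface band bending.
    """
    kt = K_B * temp
    return Q * math.sqrt(p_sub * EPS_SI / (2.0 * kt)) * math.exp(u_mag / 2.0)


def series(c_a, c_b):
    """Series combination of two capacitances, as in Eq. (3.6)."""
    return c_a * c_b / (c_a + c_b)


def cg_over_c0(tox_angstrom, u_mag, p_sub):
    """Ratio of gate capacitance to oxide capacitance for a metal-gate MOSC."""
    c0 = EPS_OX / (tox_angstrom * 1e-8)
    return series(c_accumulation(u_mag, p_sub), c0) / c0


for tox in (130.0, 20.0):
    print(f"Tox = {tox:5.1f} A   Cg/C0 = {cg_over_c0(tox, 6.0, 1e16):.3f}")
```

For the same band bending, the thinner oxide leaves Cg further below C0, which is why reading the oxide thickness directly from the accumulation capacitance becomes less reliable as the oxide is thinned.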

Polysilicon-Gate  CV 

Implicit in the derivation of the metal-gate CV theory above was that the capacitance of the gate is infinite and that the voltage drop across the gate is zero. With metal gates, this is a reasonable assumption for the ideal isolated device. However, with polysilicon gates, there is a finite polysilicon gate capacitance as well as a voltage drop [3, 49]. Indeed, the capacitor is now a semiconductor-oxide-semiconductor device, so it will have a corresponding surface potential for the gate, as well as an associated gate capacitance with a form exactly like the substrate capacitance.

This requires only minor additional derivation to arrive at the poly-gate MOS capacitor (MOSC) ideal device characteristics. Figure 3.2 shows the band diagram for an n+-polysilicon-gate MOS capacitor with a p-type substrate (a schematic of the device would be identical to Fig. 3.1(a), with the metal gate replaced by a polysilicon gate). From this figure it is clear that the potential drop across the device can be given similarly to (3.1) as

    -V_{F-poly} + (E_C - E_I)/q + \chi_S + V_{IG} + V_O
        = \chi_S - V_{IX} + (E_C - E_I)/q + V_F + V_G.    (3.9)

Assuming the energy gap has not narrowed due to the higher gate doping, the (EC - EI) terms are identical and cancel because the materials are both silicon. The electron affinity is the same for both the gate and substrate for the same reason. This reduces (3.9) to

    V_G = V_O + V_{IX} + V_{IG} + V_F + V_{F-poly}.    (3.10)

Thus, for the poly-gate case, φMS (more aptly called φGS, where 'G' represents the gate, but still traditionally referred to as 'M' for metal) is simply given by

    \phi_{MS} = V_F + V_{F-poly}.    (3.11)

For Figure 3.2, φMS is given by ln(Pxx·NGG/ni^2), where Pxx is the substrate doping ('P' implying p-type) and NGG is the gate doping ('N' implying n-type). This simple formula assumes a Boltzmann carrier distribution in the substrate and gate, which is invalid in the gate due to the high doping and likely invalid in the substrate as well for modern ULSI devices. A more appropriate formula using inverse Fermi-Dirac integrals can be derived following the examples in the appendix.

Fig. 3.2 Band diagram of n+ polysilicon-gate MOS capacitor with all the potential drops labeled. The band diagram shown depicts a positive voltage VG applied at the gate, with the SiO2/Si surface entering inversion and the poly-Si/SiO2 surface depleting.

The extra potential drop from the poly gate is easily taken into account via
Kirchhoff's law with VIG:

$$V_G = V_{FB} + V_{IX} + V_{IG} + \varepsilon_S E_{IX}/C_0, \tag{3.12}$$

where VFB from (3.5) still holds assuming negligible contribution from the
polysilicon/oxide interface, using (3.11) for φMS. Finally, the gate capacitance formula
needs to be extended for three capacitors in series. This changes (3.6) to

$$C_g = \frac{C_{ix} C_0 C_{ig}}{C_{ix} C_0 + C_{ix} C_{ig} + C_{ig} C_0}, \tag{3.13}$$

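where Cig, the capacitance from the polysilicon gate, is given by

$$C_{ig} = \frac{q}{E_{IG}} \left[ -N_V \mathcal{F}_{1/2}(-U_{IG} - U_V + U_F) + N_C \mathcal{F}_{1/2}(U_{IG} + U_C - U_F) + \frac{P_{GG}}{1 + g_A \exp(U_F - U_A - U_{IG})} - \frac{N_{GG}}{1 + g_D \exp(U_D - U_F + U_{IG})} \right]. \tag{3.14}$$
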
This is simply (3.8) rewritten with the band notation for the gate. Thus, UIG is the
normalized surface potential in the gate, EIG is the field in the gate (defined below), and
PGG, NGG, UV, UC, NC, NV, gA, gD, UD, and UA are precisely as defined before, except that
they apply now to the gate rather than the substrate. UF above carries a gate-specific
subscript elsewhere; it is left as UF in (3.14) to maintain the symmetry of the equation.
The gate field is given by

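$$\begin{aligned}
E_{IG}^2 = \frac{2kT}{\varepsilon_S} \Big\{ & N_V \left[ \mathcal{F}_{3/2}(-U_{IG} - U_V + U_F) - \mathcal{F}_{3/2}(-U_V + U_F) \right] \\
& + N_C \left[ \mathcal{F}_{3/2}(U_{IG} + U_C - U_F) - \mathcal{F}_{3/2}(U_C - U_F) \right] \\
& + P_{GG} \left[ U_{IG} + \ln \frac{1 + g_A \exp(U_F - U_A - U_{IG})}{1 + g_A \exp(U_F - U_A)} \right] \\
& + N_{GG} \left[ -U_{IG} + \ln \frac{1 + g_D \exp(U_D - U_F + U_{IG})}{1 + g_D \exp(U_D - U_F)} \right] \Big\},
\end{aligned} \tag{3.15}$$
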
which is identical to (3.7) with the surface potentials changed. Again, the same caveat
applies to (3.15): all the terms refer to the gate now, not the substrate. Quantities such as
trap levels and band edges are nearly, if not exactly, the same in the substrate and
polysilicon gate. However, UF is clearly quite different (assuming the gate and substrate
are not doped identically, which would make a poor capacitor or transistor).

An  additional  equation,  which  was  not  needed  in  the  metal-gate  case,  is  required 
to  relate  the  gate  and  substrate.  This  equation  equates  the  charge  density  at  the  gate/oxide 
interface  with  the  charge  density  at  the  oxide/substrate  interface: 

$$\varepsilon_S E_{IX} + Q_{ITX} + \varepsilon_S E_{IG} + Q_{ITG} = 0.$$

QITX  is  the  interface  charge  at  the  substrate/insulator  interface  and  QITG  is  the  charge  at 
the  gate/insulator  interface.  It  is  assumed  that  these  values  are  negligible,  and  that  the 
dielectric  constant  for  the  silicon  substrate  and  the  silicon  gate  are  identical  (already 
implicitly assumed in the equation). This gives

$$E_{IX} + E_{IG} = 0,$$

which allows the surface potential in the gate to be related to the surface potential in the
substrate.  The  iterative  solution  of  the  above  equation  requires  many  calculations  of  (3.7) 
and  (3.15),  and  is  the  most  time-consuming  part  of  the  LFCV  solver  as  well  as  any 
software  using  the  routine  (such  as  a  parameter  extractor  which  works  by  comparing  the 
data  to  the  theoretical  curve,  as  discussed  later  in  the  application  section). 
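
The nested solve described above can be sketched in a few lines. This is a minimal illustration, not the dissertation's code: it swaps the Fermi-Dirac integrals of (3.7) and (3.15) for Boltzmann statistics with fully ionized dopants, takes the gate band bending as positive when the n+ gate depletes (so that the potentials add as in (3.12)), and uses plain bisection. The function names, the intrinsic density, and the bracketing intervals are all assumptions made for the example.

```python
import math

Q = 1.602e-19               # electron charge, C
VT = 0.025852               # kT/q at 300 K, V
EPS_SI = 11.7 * 8.854e-14   # silicon permittivity, F/cm
EPS_OX = 3.9 * 8.854e-14    # SiO2 permittivity, F/cm
N_I = 1.0e10                # assumed intrinsic density, cm^-3


def surface_field(u, n_maj):
    """Boltzmann surface field (V/cm) for normalized band bending u.

    u > 0 depletes, and eventually inverts, the majority carriers.
    """
    n_min = N_I ** 2 / n_maj
    g = n_maj * (math.exp(-u) + u - 1.0) + n_min * (math.exp(u) - u - 1.0)
    return math.copysign(math.sqrt(2.0 * Q * VT / EPS_SI * max(g, 0.0)), u)


def bisect(f, lo, hi, tol=1e-12):
    """Root of an increasing function f bracketed by [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gate_voltage(u_ix, p_sub, n_gate, t_ox, v_fb):
    """Return (V_G, u_ig, E_IX) for a given substrate surface potential."""
    e_ix = surface_field(u_ix, p_sub)
    # Field matching: the gate band bending must give the same field.
    u_ig = bisect(lambda u: surface_field(u, n_gate) - e_ix, -40.0, 80.0)
    c_0 = EPS_OX / t_ox
    v_g = v_fb + VT * (u_ix + u_ig) + EPS_SI * e_ix / c_0   # form of (3.12)
    return v_g, u_ig, e_ix


def solve_bias(v_g, p_sub, n_gate, t_ox, v_fb):
    """Return (u_ix, u_ig, C_g per unit area) at gate voltage v_g."""
    def vg_of(u):
        return gate_voltage(u, p_sub, n_gate, t_ox, v_fb)

    u_ix = bisect(lambda u: vg_of(u)[0] - v_g, -25.0, 50.0)
    h = 1e-4
    v_hi, _, e_hi = vg_of(u_ix + h)
    v_lo, _, e_lo = vg_of(u_ix - h)
    c_g = EPS_SI * (e_hi - e_lo) / (v_hi - v_lo)   # dQ_G/dV_G
    return u_ix, vg_of(u_ix)[1], c_g
```

A call such as `solve_bias(3.0, 5e17, 9e18, 50e-8, -1.0)` (lengths in cm, doping in cm^-3) returns both normalized surface potentials and the gate capacitance per unit area. Every outer bisection step triggers a complete inner solve, which is the cost the text refers to.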

Polysilicon-Gate  Effects 

The effect of polysilicon gates, compared to metal gates, is a reduction of the gate
capacitance, Cg, when the gate is in depletion. This arises when the value of Cig falls
below that of C0 and Cix, which only occurs during gate depletion and substrate inversion
or accumulation, and only then to a significant degree for thin oxides. This is easily
visualized from the three series capacitances: the one which dominates is the smallest,
and the capacitances due to the substrate and gate are both minimized during depletion
(and maximized during accumulation, as well as inversion for the LF case). As oxides
thin, the oxide capacitance increases, which causes the substrate and gate
depletion to have more control over the characteristics of the Cg-VG curve.
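
A short numerical sketch makes the scaling argument concrete. The silicon capacitances below are hypothetical fixed values chosen only for illustration (in the real model they depend on bias and doping), and the function names are not from the dissertation.

```python
EPS_OX = 3.9 * 8.854e-14  # SiO2 permittivity, F/cm


def series_capacitance(c_ix, c_0, c_ig):
    """Three capacitors in series, written in the form of (3.13)."""
    return c_ix * c_0 * c_ig / (c_ix * c_0 + c_ix * c_ig + c_ig * c_0)


def cg_over_c0(t_ox_angstrom, c_ix=2.0e-5, c_ig=1.0e-6):
    """Cg/C0 for hypothetical, fixed silicon capacitances (F/cm^2)."""
    c_0 = EPS_OX / (t_ox_angstrom * 1e-8)   # oxide capacitance per unit area
    return series_capacitance(c_ix, c_0, c_ig) / c_0


for t_ox in (1000.0, 50.0):
    print(f"{t_ox:6.0f} A oxide: Cg/C0 = {cg_over_c0(t_ox):.2f}")
```

With the same depleted-gate capacitance, the thick oxide loses only a few percent of C0, while the thin oxide loses roughly 40%.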

Figure 3.3 shows the difference between metal-gate and polysilicon-gate data,
normalized to C0, for two different technologies. The 'higher' pair of curves for a 1000
Å oxide (thick oxide means low oxide capacitance) shows little difference between
polysilicon gates and metal gates. The lower pair of curves for a 50 Å oxide (thin oxide
means large oxide capacitance) shows a large decrease in Cg for all values of VG,
particularly for VG > 1 V, where the gate is still in depletion and the substrate is inverted.

Fig. 3.3 Comparison of metal-gate and n+ poly-gate MOSC curves for two
different technologies. One set has 1000 Å oxide with PXX = 3×10^16
cm^-3 and the second set has 50 Å oxide with PXX = 2×10^17 cm^-3. In each
case, the VFB is adjusted to be -1.0 V and the gate doping is 3×10^17
cm^-3. Clearly shown is the dramatic difference between poly-gate
(dotted line) and metal-gate (solid line) for the 50 Å case, and the
negligible impact on the 1000 Å case: the polysilicon gate effects
increase as the oxide scales thinner.

This continual decrease in Cg for increasing VG (in this n+ poly-gate on p-Si substrate) is
often referred to as 'poly depletion,' since the polysilicon gate is still depleting.
Eventually the gate itself will invert, and the characteristics will be much improved.
However, the field resulting from the gate voltage required to invert the gate is typically
beyond the reliability limit of 4 MV/cm in properly scaled devices. In fact, the only way
to make the gate invert sooner is to lower the gate doping, which exaggerates the poly
depletion effect even more until the gate inverts.

It  might  seem,  as  it  did  to  this  author,  that  the  ultimate  solution  would  be  to  use 
undoped  gates,  as  they  would  invert  much  sooner  and  behave  just  like  metal  gates  at 
reasonably  low  applied  gate  voltages.  This  works  well  in  simulation,  but  the  question 
then  becomes:  where  is  the  supply  of  minority  carriers  to  invert  the  gate?  In  particular,  for 
an n+ gate in a rapidly switching MOST, what would supply the holes? It has been shown
that,  for  at  least  one  technology,  the  holes  are  likely  supplied  via  thermal  generation 
(rather  than  ion  impact)  [50].  Thermal  generation,  then,  could  not  supply  the  holes  fast 
enough  for  practical  use  of  an  undoped  gate.  However,  it  might  be  possible  to  design  in  a 
minority  carrier  source  nearby  to  supply  minority  carriers  (similar  to  how  the  source  and 
drain  supply  minority  carriers  in  the  substrate). 

The reduction in the gate capacitance due to poly depletion causes a reduction in
the drive current, which degrades circuit performance [51-54], since the amount of
current supplied by the transistor directly relates to the switching speed of the device. In
a complementary-MOS (CMOS) circuit, the current charges up the interconnect and
intrinsic capacitances of the next transistors in the line, as discussed in detail in Chapter
4. Because of this poly-gate ID reduction, there may eventually be a move back to metal
gates (or silicides) once the processing issues of gate alignment are solved.

It is instructive to look at the individual capacitance components to see how the
'complex' poly LFCV curve forms. Figure 3.4 shows such a curve for a theoretical 50.0
Å oxide with an n+ gate doped (rather lowly) to 9×10^18 cm^-3 and a substrate doped to
5×10^17 cm^-3. The gate area is 1×10^-4 cm^2 and the flatband voltage is -1.0 V. The Cg
curve, being the serial sum of C0, Cix, and Cig (Eq. 3.13), is always lower than the
component curves. It can be clearly seen how each of these three components influences
the overall structure of the resulting gate capacitance. In fact, this 'regional' effect will
be used to help speed up parameter extraction in the next section.

Also of interest is a breakdown of potentials across the MOSC device as a
function of VG. Figure 3.5 shows the four components of VG, namely VIX, VIG, Vox,
and VFB (see Eq. 3.12), as a function of VG using the same parameters as the example in
the last paragraph. To show how these are related to the resulting gate capacitance, the
Cg-VG curve is also plotted. What is most relevant in this figure is that the primary
'dip' in the CV curve occurs as the surface potential in the substrate, VIX, sweeps from
accumulation to inversion (i.e., moves from a small negative number to about one volt),
and ends sharply as the surface potential approaches its maximum (strong inversion).
Similarly, the secondary poly-depletion 'dip' occurs as the gate surface potential, VIG,
moves from accumulation to inversion (again, moves from a small negative voltage to
around a volt). Note that the final surface potential in the gate is higher than that in the
substrate (VIG > VIX when VG > 4 V). This agrees with the common approximation that
the surface potential pins to a little over 2VF, since the Fermi voltage in the gate will be
larger than that of the substrate due to the greater gate doping.

Fig. 3.4 Individual capacitance values for a theoretical 100×100 µm nMOSC
with a 50 Å gate oxide, PXX = 5.0×10^17 cm^-3, NGG = 9×10^18 cm^-3, T = 300
K, and VFB = -1.0 V. This figure demonstrates how the three series
capacitances (Cix, Cig, and C0) combine to give the overall gate capacitance.
See Fig. 3.5 for the corresponding potential breakdown.

Fig. 3.5 Individual potential breakdown for a theoretical 100×100 µm nMOSC
with a 50 Å gate oxide, PXX = 5.0×10^17 cm^-3, NGG = 9×10^18 cm^-3, T = 300
K, and VFB = -1.0 V, along with the corresponding LFCV curve. Note
how the surface potential in the substrate, VIX, increases rapidly in the
range VG = -1 to 0 V as Cg increases (substrate inversion) and the
similar increase in VIG in the range VG = 1 to 3 V (gate inversion). See
Fig. 3.4 for the corresponding capacitance breakdown.

Parameter  Extraction  Using  the  LFCV  Model 

Of  the  multitude  of  variables  in  the  LFCV  equations,  most  of  them  are  known  to  a 
reasonable  degree  of  accuracy  (such  as  the  dielectric  constant,  energy  gap,  conduction- 
band  density,  etc.),  can  be  measured  easily  (temperature),  or  need  not  be  known  very 
accurately  (acceptor  and  donor  trap  level)  due  to  their  small  effect.  This  leaves  the  gate 
and  substrate  doping,  the  oxide  thickness,  and  the  flatband  voltage  as  the  'unknown' 
parameters. 

These parameters may be extracted by comparing experimental data to the
theoretical model presented in this chapter. This may appear to
be an easy task, since the equation need only be used, along with some data, in
conjunction with a nonlinear least-squares solver. However, one will note that the
polysilicon-gate LFCV formula is doubly parametric (that is, it is related through two
parameters, the surface potentials UIX and UIG), neither of which is known from the
data. Thus, solving this problem is nontrivial.

The first step toward a solution, then, is to write a program which will calculate
Cg given VG. This requires intensive calculations to find UIX and UIG for each VG, but
can be done since there is only one unique solution. Thus, with a Cg(VG) routine written,
a nonlinear least-squares-fit program can be used. The code written for this dissertation
took advantage of the fact that, as the solution converges to values of the unknown
parameters, the values of the surface potentials at each experimental data point could be
used as initial guesses for each subsequent iteration over VG to find each Cg (since the
parameters, namely the substrate and gate doping, oxide thickness, and flatband voltage,
should not be changing too rapidly). This greatly increased the convergence rate over
estimating UIX and UIG on each call, at the expense of additional code complexity and
memory usage.
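
The warm-start idea can be illustrated with a small sketch. Everything here is hypothetical: the class name, the Newton iteration, and especially the toy residual, which merely stands in for the coupled surface-potential equations. The point is only that the root stored for each data point is reused as the next initial guess when the fitting parameters change slightly.

```python
import math


class WarmStartSolver:
    """Newton solver that reuses the last root found for each data point."""

    def __init__(self, residual, derivative, cold_guess=0.0):
        self.residual = residual
        self.derivative = derivative
        self.cold_guess = cold_guess
        self._last_root = {}   # data-point index -> last converged value
        self.iterations = 0    # running count, to compare cold and warm starts

    def solve(self, index, params, tol=1e-12, max_iter=100):
        # Start from the previous root for this data point if one exists.
        u = self._last_root.get(index, self.cold_guess)
        for _ in range(max_iter):
            step = self.residual(u, params) / self.derivative(u, params)
            u -= step
            self.iterations += 1
            if abs(step) < tol:
                break
        self._last_root[index] = u
        return u


# Toy stand-in for the surface-potential equations: u + a*exp(u) = v.
def toy_residual(u, p):
    return u + p["a"] * math.exp(u) - p["v"]


def toy_derivative(u, p):
    return 1.0 + p["a"] * math.exp(u)
```

Solving data point 0 once with `{"a": 1.0, "v": 5.0}` and again with `{"a": 1.01, "v": 5.0}` takes fewer Newton steps the second time, at the cost of one stored value per data point.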

3-Point  Extraction  Methodology 

If the model were perfect, then it would require only three points to match the
experimental data to the model. Why only three data points for four parameters?
Because the additional constraint that one of the points should be the minimum of the
experimental LFCV curve can be used. From this information, the flatband voltage can be
found by comparing the VG of the theoretical minimum with the VG of the experimental
minimum. The other three parameters can be found directly from the Cg values of the
three points. Figure 3.6 shows the three points, labeled Cg-acc, Cg-dep1, and Cg-dep2, as
they relate to the whole LFCV curve. Only Cg-dep1 is unique; the other two points can be
anywhere within their region.

The Cg-acc point is a point from the LFCV gate accumulation region. From this, a
good estimate of the oxide thickness can be found, since the other parameters have very
little influence over this point (see Figs. 3.7 and 3.8). Cg-acc asymptotically approaches
C0, which is inversely proportional to the oxide thickness, Tox, via the parallel-plate
formula. There has been much research in obtaining Tox and/or C0 from (substrate)
accumulation CV data [48,55-58].
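
As a minimal sketch of the parallel-plate estimate, assuming Cg = C0 in accumulation and an SiO2 relative permittivity of 3.9 (the function name and units are choices made for this example):

```python
EPS_OX = 3.9 * 8.854e-14  # SiO2 permittivity, F/cm


def oxide_thickness_angstrom(c_acc_farads, area_cm2):
    """First-order oxide thickness, assuming Cg = C0 in accumulation."""
    t_ox_cm = EPS_OX * area_cm2 / c_acc_farads
    return t_ox_cm * 1e8   # cm -> angstrom
```

Because the measured accumulation capacitance only approaches C0 from below, this estimate errs on the thick side and is best treated as a starting value.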

Fig. 3.6 Example of a general polysilicon-gate LFCV curve (n+ gate, p substrate
for this case) showing how all the important regions can be labeled in terms of the
gate state rather than the typical substrate state. This regional
breakdown is used to improve the speed and accuracy of the parameter
extraction routine.

The Cg-dep1 point is the minimum of the LFCV curve, and allows us to find the
substrate doping, since the substrate depletion region is strongly dependent on the
substrate doping concentration. In fact, depletion CV data can also be used to determine
the substrate doping profile [59-61]. Figure 3.7 shows LFCV data for several different
constant substrate doping concentrations, clearly demonstrating the strong dependence of
the location of Cg-dep1 on the substrate doping. This was also demonstrated in Fig. 3.4,
since the main influence in this depletion region is Cix, which itself is strongly dependent
on UF (see Eq. 3.8), which is directly related to the inverse Fermi-Dirac integral (natural
logarithm if assuming a Boltzmann distribution) of the substrate doping. The position of
the minimum along the VG axis also allows us to estimate the flatband voltage by
comparing the VG of the minimum of the theoretical curve to the VG of the data.

The Cg-dep2 point is from the gate depletion region. Figure 3.8 shows that the gate
doping has the most effect on this part of the curve, whereas Figure 3.7 shows that the
substrate doping has very little effect in this region. For the n+ gate on p-substrate
example in Figure 3.8, the substrate is in inversion. However, even if the substrate were
n-type (and the substrate thus accumulated), the Cg depletion would still occur because the
gate would still be in depletion (of course, the entire curve would be shifted due to the
flatband difference). Hence, this point is called Cg-dep2, with the 'dep' in reference to the
depleted state of the gate.

By  varying  the  parameters  in  the  appropriate  regions  to  match  these  three  points,  a 
unique  parameter  set  will  be  obtained  which  will  describe  a  theoretical  LFCV  curve 
passing  through  the  three  points. 

Fig. 3.7 Effect of substrate doping changes (PXX) on LFCV characteristics. The
'depletion-1' region (see Fig. 3.6) is the region of largest impact.

Fig. 3.8 Effect of gate doping changes (NGG) on LFCV characteristics. The
'depletion-2' region (see Fig. 3.6) is the region of largest impact.

3-Region Extraction Methodology

As  good  as  our  model  is,  there  are  still  several  effects  which  are  not  being 
considered.  These  include  retrograde  doping  in  the  substrate  and  quantum  effects  in  the 
substrate inversion channel. Retrograde doping is commonly used for sub-half-micron
design  to  maintain  a  high  sub-surface  doping  concentration  to  prevent  punchthrough, 
while  still  maintaining  a  low  VT  for  low-VG  operation  (to  accommodate  the  thin  oxides) 
[7].  Figure  3.9  shows  an  example  of  a  retrograde  profile  from  our  internally-modified 
MINIMOS.  The  LFCV  model  assumes  a  constant  doping  profile  in  both  the  substrate 
and  gate,  and  so  deviation  from  this  assumption  will  cause  changes  in  the  experimental 
LFCV  curve  relative  to  the  theoretical  model. 

Charge-carrier layer push-out due to quantum effects in the inversion and
accumulation layers has been an area of much research [62-65]. Experimental
verification of these quantum effects is invariably performed at low temperatures, where
phonons will not broaden the quantum bands into a continuum. Although some amount of
quantum effect is likely present, it is probably impossible to model correctly when one
considers thermal broadening, SiO2/Si interface roughness and transitional regions,
non-random dopant distribution, and other non-idealities. These will all tend to broaden the
electron levels into a more classical continuum.

It has been noted that electrical and optical oxide thicknesses do not often agree,
and the difference has been attributed to quantum effects. As will be discussed later, the
effect is likely overestimated. More important, if there is a difference, it is the electrically
effective oxide thickness (as determined from electrical experiments, such as CV)
which is most important compared to the optical thickness (which is not what affects
device performance).

Fig. 3.9 Sample of retrograde doping profile, showing low surface concentration
(5×10^16 cm^-3) and higher bulk concentration (1×10^18 cm^-3).

Due to these two main non-idealities (non-constant doping and quantum effects),
the parameters extracted using only three points could depend on the points themselves.
That is, extracted parameters might be dependent on which points we choose for Cg-acc
and Cg-dep2. To overcome this, the entire curve could be fit to the model. This would
result  in  extremely  long  convergence  times,  as  a  partial  derivative  must  be  calculated  for 
each  variable  at  every  point  for  every  iteration.  However,  Figures  3.7  and  3.8  show  that 
some  parameters  have  no  influence  on  the  LFCV  curve  in  certain  gate-voltage  regions. 
Thus,  the  information  provided  from  their  partial  derivatives  does  not  help  convergence, 
and  will  actually  slow  down  the  convergence,  not  to  mention  waste  time  during  the 
calculation. 

Instead  of  fitting  all  the  data  to  the  model,  the  data  can  be  broken  up  into  the  same 
three  regions  suggested  in  Fig.  3.6  for  the  three-point  fit.  Then  the  model  can  be  fit  using 
only  the  parameter  (or  parameters)  dominant  in  the  specific  region,  thus  greatly 
improving  the  convergence  rate  (since  the  data  being  used  is  most  relevant).  This  adds 
complication  to  the  coding,  as  the  data  partitioning  into  each  region  (discussed  later)  must 
be automated, and a different fitting routine must be created for each of the three regions
(same  model,  but  separate  partial  derivative  calculations). 

Methodology  Comparison 

Figure 3.10 shows a comparison of the fit using the 3-point and 3-region methods
to experimental data for a 130 Å oxide from an industrial 100×100 µm MOST.

Fig. 3.10 Theoretically generated curves compared to original LFCV data using
the (A) three-point and (B) full-curve extractions on an n+ polysilicon
gate, 100×100 µm nMOST. Extracted parameters were: (A) XOX = 130 Å,
PXX = 9.0×10^16 cm^-3, NGG = 5.8×10^19 cm^-3, and VFB = -1.06 V. (B)
XOX = 130 Å, PXX = 8.7×10^16 cm^-3, NGG = 3.0×10^19 cm^-3, and
VFB = -1.01 V.

The solid line, of course, is the theoretical curve using the extracted parameters, and the
'x' marks are the data used for the extraction. The top (A) curve uses only the three
points marked to fit the data while the lower (B) curve uses the full set of data. The RMS
error (for the second curve), calculated from the square root of the sum of the squares of
the difference between theoretical and experimental capacitance, divided by the square
root of n - 5 (5 = degrees of freedom = 1 + number of fitting parameters), was 2.7%, an
excellent fit for only four parameters. There is not much in the literature against which to
compare the 'goodness' of these results, as there are not many capacitance models available
(most compact models, such as BSIM3 [34], fit IV characteristics only, and do not consider
capacitance extraction).
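
The error measure just described can be written out as a short sketch. The text does not say how the differences are normalized to obtain a percentage; the version below assumes each difference is taken relative to the measured capacitance, and the function name is hypothetical.

```python
import math


def rms_fit_error_percent(c_theory, c_data, n_params=4):
    """RMS fit error in percent, using n - (1 + n_params) in the denominator.

    Assumption: differences are normalized by the measured capacitance.
    """
    dof = len(c_data) - (n_params + 1)
    if dof <= 0:
        raise ValueError("need more data points than 1 + number of parameters")
    sum_sq = sum(((ct - cd) / cd) ** 2 for ct, cd in zip(c_theory, c_data))
    return 100.0 * math.sqrt(sum_sq) / math.sqrt(dof)
```

With four fitting parameters the denominator is the square root of n - 5, matching the description above.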

Aggressively scaled MOSC device data is given in Figure 3.11. This shows two
LFCV curves from a ~20 Å oxide from an industrial MOST, where the 20 Å was
determined from some optical method, most likely ellipsometry. This is a rather complex
figure, and requires some explanation. There are two experimental curves: one p+
poly-gate and one n+ poly-gate, both on a p-well. Thus, in the VG > 1 V region, the n+ gate is
in depletion, while the p+ gate is in accumulation (clearly the curves were shifted to align
them, as the flatband should differ by about a volt between the two curves, although a
threshold adjustment implant would offset this somewhat). From looking at the data at
1.3 V and assuming Cg = C0, the providers of the data conclude that the effective oxide
thickness is 33 Å for the depleted n+ gate, and 28.5 Å for the accumulated p+ gate device.
This is a poor approximation, since the oxide thickness value would vary greatly at
different points along the depleted curve. However, they correctly state that the difference
between the two is due to poly depletion, which is a reasonable statement when applied to
that specific gate voltage only. Thus, they attribute a 4.5 Å reduction in effective oxide
thickness due to polysilicon depletion at 1.3 V, which is the operating voltage for that
technology. They next assert that the difference between the accumulated-gate curve
(28.5 Å based on assuming Cg = C0) and the optically measured oxide (20 Å) must be
entirely due to quantum effects. This conclusion is wrong, as the quantum model most
likely does not take all the effects mentioned previously into account (such as thermal
broadening, interface roughness and transitional region, not to mention retrograde
doping). More importantly, they are completely ignoring the severe error of using
Cg = C0.

Figure 3.12 shows the fit (using the three-point method) to the data in Figure 3.11.
From this fit, the extracted oxide thickness is 24.4 Å. Compared to the industry-stated
results, the same effective oxide thickness reduction due to poly (4.7 Å here versus 4.5 Å)
is seen. However, by correctly accounting for the distribution (instead of assuming
Cg = C0), an additional 4.1 Å reduction from the Fermi-Dirac distribution is also found
(that is, from the fact that Cg < C0). This leaves 4.4 Å of difference between the extracted
oxide thickness of 24.4 Å and the optically measured thickness of 20 Å. This 4.4 Å
difference may include some quantum effects, but it may also be due to optical errors, such
as not accounting for the transitional layer properly [66-67], and/or some other effects
(e.g., doping profile). Ellipsometry and other optical methods cannot be used on the actual
device (since the gate electrode is not transparent), so they do not measure the oxide
thickness in the active part of the device, which may differ slightly due to the additional
processing. As far as parameter extraction is concerned, the most important factor should
be agreement with electrical results, not optical.

Fig. 3.11 P+ poly/p-well and n+ poly/p-well '20 Å' industrial data. Data is shifted
to align minimums, and labelling refers to industrial interpretation of the
two curves. Please see text for explanation of this breakdown.
Compare to Fig. 3.12, which is the author's interpretation of the same
data after quantumless parameter extraction.

Fig. 3.12 Theoretically generated LFCV data from three-point parameter extraction
using data in Fig. 3.11. Extracted parameters are shown. Instead of
attributing the difference between the optical thickness of 20.0 Å and the
'extrapolated' thickness of 28.5 Å at VG = 1.3 V to quantum effects, we
find that 4.1 Å is due to the distribution function used (Fermi-Dirac),
with the remaining 4.4 Å possibly due to error (in optical measurement
and/or other factors).

Although a three-point method was used here (to improve the match at the 1.3 V
point), a full fit to the data in Fig. 3.12 has about a 10% RMS error, which, considering
the thin oxide and the fact that the doping is not constant, is extremely
good. An interesting side note, brought up during the proposal for this project, concerned
fitting a Boltzmann model to the data instead of a Fermi-Dirac model. The surprising result
of this was a better fit (6.6% RMS error), but, as would be expected, a thicker extracted
oxide thickness of 27.5 Å. This is an interesting result, as it shows that using the wrong
model can appear to give 'better' results (in terms of fit), even though the resulting
parameters actually have greater error (due to the incorrect carrier distribution).

Philosophically,  the  issue  of  oxide  thickness  is  an  interesting  topic.  Many  people 
would  argue  that  TEM  is  the  only  way  to  measure  the  'true'  thickness.  However, 
ignoring  that  this  is  a  destructive  and  time-consuming  technique,  it  only  yields  the 
thickness  of  that  particular  cross  section.  What  is  desired  is  the  average  oxide  thickness, 
as  it  is  the  average  oxide  thickness  which  affects  the  amount  of  charge  accumulated  by  an 
applied voltage in a MOSC. This is why Fowler-Nordheim (FN) tunnelling is also not a
particularly good method: it will always underestimate the oxide thickness since the
tunnelling will occur in the thinner spots on the gate. Additionally, one would rather not
stress the devices while trying to find the oxide thickness. One recent technique for
ultra-thin oxide thickness determination uses quantum oscillations in the tunnelling gate
current, which are caused by quantum interference of electrons in the oxide conduction
band [68]. This method potentially suffers from the same problems as FN, and
additionally requires knowledge of the effective mass and oxide barrier height,
which add about 2.5 Å of error to the results [69], assuming they are known to
within 5%.

Convergence  Speed-up  Details 

Iteration stops when none of the fitting parameters (i.e., the extracted parameters)
changes by more than 5×10^-4 % between successive serial cycles (i.e., each parameter was
fit, and none changed by more than 5×10^-4 %). Below are several of the methods
employed to speed up this convergence.
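
The stopping rule can be expressed as a short check. This is a sketch only: the function name and the dictionary layout are assumptions, and the text does not say whether the percentage applies to the doping values or to their logarithms.

```python
def converged(previous, current, tol_percent=5e-4):
    """True when no parameter changed by more than tol_percent between cycles."""
    return all(
        abs(current[name] - previous[name]) <= tol_percent / 100.0 * abs(previous[name])
        for name in previous
    )
```

For example, a step in oxide thickness from 50.0 to 50.0001 passes the test, while a step from 50.0 to 50.001 does not.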

The calculations involved in the extraction are extremely complex. Of greatest
importance is the convergence speed of the poly LFCV model, which itself must
converge on the two surface potentials just to give one Cg data point for a given VG. This
one data point is used in the numerical partial derivatives for the nonlinear least-squares
fit, which means that Cg(VG) is called twice for each experimental data point for each
iteration! Because this model is called so frequently, it is important to keep track of all
the converged surface potentials for each experimental data point so that the LFCV model
has a good estimate for subsequent calls.

The delta used for the numerical partial derivatives, as it turns out, has a major
influence on the correctness of the fit. Since two of the parameters vary logarithmically
(the substrate and gate doping), the actual parameters used during the fit are the logs of
these parameters. Thus, the deltas used for the derivatives must be calculated differently.
Another problem is that the delta used for the oxide thickness is dependent upon the
actual thickness of the oxide (that is, it should be different for a 50 Å oxide compared to a
1000 Å oxide). The empirical results for the best deltas, as determined from analysis of the
numerical derivatives, were Δ=10⁻⁷ for ln(NXX), Δ=0.1 for ln(NGG), Δ=10⁻⁴ for VFB, and
10⁻⁷×Tox for Tox. For example, the partial derivative of Cg with respect to Tox is
calculated from (Cg(VG)₁ - Cg(VG)₂)/2Δ, where Cg(VG)₁ is the gate capacitance
calculated at some VG with an oxide thickness of Tox + Δ and Cg(VG)₂ is the gate
capacitance calculated at the same VG with an oxide thickness of Tox - Δ. The reason
arbitrarily small values cannot be used, of course, is that the LFCV model itself is
only accurate to about eight digits (less near flatband) due to its own internal convergence
criteria [46].
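
A minimal sketch of the central-difference derivative with parameter-specific deltas is given below. The step sizes are those quoted above; everything else is assumed for illustration, including the parameter names, the placeholder model `toy_cg` (an ideal oxide capacitance per unit area, not the poly LFCV model), and the numerical values.

```python
def step_size(name, params):
    """Step sizes quoted in the text; the Tox step scales with the current Tox."""
    if name == "tox":
        return 1e-7 * params["tox"]
    return {"ln_nxx": 1e-7, "ln_ngg": 0.1, "vfb": 1e-4}[name]


def central_difference(model, vg, params, name):
    """d(model)/d(params[name]) from two model evaluations at +/- delta."""
    delta = step_size(name, params)
    upper = dict(params)
    lower = dict(params)
    upper[name] += delta
    lower[name] -= delta
    return (model(vg, upper) - model(vg, lower)) / (2.0 * delta)


def toy_cg(vg, params):
    """Placeholder model: ideal oxide capacitance per unit area, eps_ox / tox."""
    eps_ox = 3.9 * 8.854e-14  # F/cm, assumed SiO2 permittivity
    return eps_ox / params["tox"]


# Hypothetical parameter set; tox is in cm (50 angstroms).
params = {"ln_nxx": 37.0, "ln_ngg": 45.0, "vfb": -1.0, "tox": 50e-8}
dcg_dtox = central_difference(toy_cg, 1.0, params, "tox")
print(dcg_dtox)
```

Each derivative costs two model evaluations per data point, which is why the convergence speed of the underlying model dominates the run time.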

Finally, to start the extraction, a reasonable initial guess must be made. The initial
guess of the oxide thickness is simply Aεox/Cgmax, where Cgmax is the maximum gate
capacitance in the dataset and A is the gate area. This is the standard first-order
approximation based on Cg = Co. For the substrate doping initial guess, the asymptotic
high-frequency CV formula for Cg∞ [43] is solved iteratively using the minimum and
maximum Cg values from the data set. The gate doping is simply set to 3×10¹⁹ cm⁻³.
With these three parameters approximately known, the flatband voltage is estimated from
VGmin_data - VGmin_theory. Since the minimum of the CV data is not necessarily given, but
is needed internally to estimate the flatband, the minimum three data points (in terms of
Cg) are used to estimate the true Cg minimum based on the parabolic minimum formula
[70]. This slightly improves the convergence, but not as much as would be expected,
largely because the minimum of the CV curve is not very parabolic.
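
Two of these initial-guess steps are simple enough to sketch directly. The code below assumes an SiO2 permittivity of 3.9 ε0 and uses the standard vertex of the parabola through three points, which is taken here as a stand-in for the parabolic minimum formula of [70]; the function names are hypothetical.

```python
EPS_OX = 3.9 * 8.854e-14  # F/cm, assumed SiO2 permittivity


def initial_tox(cg_values, area_cm2):
    """First-order oxide thickness guess, A * eps_ox / Cgmax, in cm."""
    return area_cm2 * EPS_OX / max(cg_values)


def parabola_vertex(p1, p2, p3):
    """Vertex (x, y) of the parabola through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    denom = (x1 - x2) * (x1 - x3) * (x2 - x3)
    a = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / denom
    b = (x3**2 * (y1 - y2) + x2**2 * (y3 - y1) + x1**2 * (y2 - y3)) / denom
    c = (x2 * x3 * (x2 - x3) * y1 + x3 * x1 * (x3 - x1) * y2
         + x1 * x2 * (x1 - x2) * y3) / denom
    x_min = -b / (2.0 * a)
    return x_min, c - b * b / (4.0 * a)


def estimate_cv_minimum(vg_values, cg_values):
    """Fit a parabola to the three data points with the smallest Cg."""
    lowest = sorted(zip(vg_values, cg_values), key=lambda p: p[1])[:3]
    return parabola_vertex(*lowest)


# Hypothetical, exactly parabolic data: Cg = (VG - 1)^2 + 2.
vg = [-1.0, 0.0, 0.5, 1.5, 3.0]
cg = [(v - 1.0) ** 2 + 2.0 for v in vg]
print(estimate_cv_minimum(vg, cg))
```

On real CV data the minimum is not parabolic, so the vertex is only a modest improvement over taking the lowest measured point, as noted above.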

Because it was clear that convergence was slower as the results approached the
final values, a 'trick' was developed to improve this end case. Whenever a trend was
visible during a fit, the routine doubled the amount of the parameter increase. A trend, in
this case, is defined as three successive moves of a parameter in the same direction for all
the parameters (possibly different directions for different parameters). This cut down the
number of iterations by about 20% in most cases.
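
The trend rule can be sketched as follows. The representation of the fit history as a list of per-iteration parameter changes, and the names `has_trend` and `accelerate`, are assumptions made for illustration.

```python
def sign(x):
    return (x > 0) - (x < 0)


def has_trend(step_history, window=3):
    """True if every parameter moved in one consistent direction over the last
    `window` iterations (directions may differ between parameters)."""
    if len(step_history) < window:
        return False
    recent = step_history[-window:]
    for i in range(len(recent[0])):
        directions = {sign(steps[i]) for steps in recent}
        if len(directions) != 1 or 0 in directions:
            return False
    return True


def accelerate(proposed_step, step_history):
    """Double the proposed parameter changes while a trend is visible."""
    factor = 2.0 if has_trend(step_history) else 1.0
    return [factor * d for d in proposed_step]


# Hypothetical history: parameter 0 keeps rising, parameter 1 keeps falling.
history = [[0.1, -0.2], [0.05, -0.1], [0.02, -0.05]]
print(accelerate([0.01, -0.02], history))
```

Requiring the trend in all parameters at once keeps the doubling from overshooting when only some parameters have settled into a steady drift.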

One change that would have greatly improved the speed of convergence is the use of
a simpler model for the Fermi-Dirac integral. The Cody-Thatcher model [71] is
extremely accurate, but requires the quotient of ten exponentials from a Chebyshev
approximation. This approximation was used instead of simpler (though less
accurate) approximations [72-76] because it was desired to add as little error as possible
from the Fermi-Dirac integral calculation.


CHAPTER  4 

THE  EFFECT  OF  INTRINSIC  CAPACITANCE 

DEGRADATION  ON  CIRCUIT  PERFORMANCE 

Introduction 

In  this  chapter,  the  relatively  obscure  subject  of  intrinsic  capacitances  will  be 
discussed.  The  area  of  MOS  intrinsic  capacitance  has  received  little  attention  over  the 
years  due  to  the  difficulty  of  measurement  and  small  impact  relative  to  extrinsic 
capacitances  such  as  interconnect  and  packaging.  However,  as  the  push  toward  higher 
density  continues,  the  extrinsic  capacitance  is  being  reduced  as  much  as  possible  to 
improve  performance.  This  will  eventually  leave  the  intrinsic  capacitance  as  the  primary 
load  in  CMOS  circuits,  thus  making  this  a  topic  worth  studying  now.  After  a  discussion 
of the intrinsic capacitances which most affect CMOS circuits (Cgd and Cgs), direct
experimental  measurements  of  the  effect  of  hot-carrier  degradation  on  intrinsic 
capacitance  will  be  discussed,  and  the  results  modeled.  The  impact  of  this  degradation  on 
circuit  performance  will  be  evaluated  and  shown  to  offset  some  of  the  losses  due  to  ID 
degradation. 

Background 

The  effect  of  hot-carrier  degradation  on  the  drain  current,  ID,  has  been  studied 
intensely since Abbas's initial observation in 1975 [77]. Another intrinsic property of a
MOS transistor, the intrinsic capacitance, has a much shorter history of study with regard
to hot-carrier degradation. The first systematic study of intrinsic capacitances was done
by Sah [13] in 1964, which was used by Meyer in 1971 [78] in his widely referenced
work. In his paper he defines the intrinsic capacitance between terminals as:

$$C_{xy} = \frac{dQ_x}{dV_y}$$

That is, the change in the charge at terminal x due to a change in voltage at terminal y.
This definition applies to any two-or-more terminal device, but from now on will be used
with respect to a 4-terminal MOS transistor. Thus, it is clear that there are 16 possible
intrinsic capacitance terms for a 4-terminal MOS transistor. Please note that in this
small-signal definition, all of the non-y terminals are virtual ground. Thus, dVy is
referenced to ground (i.e. it is essentially relative to all the other terminals).

At first thought, one might assert that there are only 8 possible capacitances
since Cxy=Cyx. However, this is not true because the intrinsic capacitances, as defined
here, do not represent static capacitive values and are not reciprocal. Consider the two
intrinsic capacitances Cgd and Cdg. Neglecting overlap capacitance, when the applied
gate voltage is less than the gate threshold voltage, VGT, both of these capacitances
should be zero (since both Cgd=dQg/dVd and Cdg=dQd/dVg are zero due to no existing
channel). Once VGS > VGT (and VDS < VDSsat), Cgd and Cdg will both have some finite
positive value when the channel forms. The interesting case is when VGS > VGT and VDS
> VDSsat. Now there is a channel, but it is 'pinched off' near the drain end. Cdg
(dQd/dVg) is non-zero since a change in the gate voltage still affects the charge associated
with the drain (Qd); Cgd (dQg/dVd) is zero since a change in the drain voltage has no
effect on the gate charge, because the drain is not connected to the channel due to the
pinch-off. As clear as this seems now, both Meyer [78] and others [79] assumed that the
capacitances should be reciprocal. These should not be confused with the small-signal
circuit element terms, which are named the same way but actually are reciprocal by
definition.
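
The non-reciprocity argument can be checked numerically with a toy charge model. The sketch below is not the model used in this work: it assumes the textbook long-channel saturation charges with a 40/60 drain/source partition of the channel charge, neglects bulk charge and overlap, and normalizes the total oxide capacitance to 1. All names are hypothetical, and signs follow the raw derivative dQx/dVy.

```python
TERMINALS = ("g", "d", "s", "b")


def saturation_charges(v, c_ox_total=1.0, vt=0.5):
    """Toy terminal charges of a long-channel device in saturation."""
    overdrive = v["g"] - v["s"] - vt
    return {
        "g": (2.0 / 3.0) * c_ox_total * overdrive,
        "d": -(4.0 / 15.0) * c_ox_total * overdrive,  # 40 % of the channel charge
        "s": -(2.0 / 5.0) * c_ox_total * overdrive,   # 60 % of the channel charge
        "b": 0.0,                                     # bulk charge neglected
    }


def capacitance(charge_fn, bias, x, y, dv=1e-6):
    """C_xy = dQ_x/dV_y by central difference, other terminals held fixed."""
    up = dict(bias)
    down = dict(bias)
    up[y] += dv
    down[y] -= dv
    return (charge_fn(up)[x] - charge_fn(down)[x]) / (2.0 * dv)


def capacitance_matrix(charge_fn, bias):
    """All 16 terms C_xy for a 4-terminal device."""
    return {x + y: capacitance(charge_fn, bias, x, y)
            for x in TERMINALS for y in TERMINALS}


bias = {"g": 2.0, "d": 3.0, "s": 0.0, "b": 0.0}  # VGS > VGT and VDS > VDSsat
caps = capacitance_matrix(saturation_charges, bias)
print(caps["gd"], caps["dg"])
```

In this toy model the drain voltage does not enter the charges at all in saturation, so the gd term vanishes while the dg term does not, which is the asymmetry described above.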

Ward  and  Dutton  [80]  were  the  first  to  argue  that  the  intrinsic  capacitances 
were,  in  fact,  non-reciprocal.  The  paper  also  stressed  the  importance  of  including  all  the 
capacitances, particularly the gate-to-bulk (Cgb) capacitance, which had been omitted by
Sah,  and  hence  Meyer.  Ward  and  Dutton's  charge-based  model  was  a  huge  improvement 
at  the  time,  as  Meyer's  model  does  not  guarantee  charge-conservation  in  circuit 
simulators  (due  to  omitting  Cgb),  resulting  in  erroneous  results  for  the  simplest  of  circuits. 

Papers  predating  Meyer's  work  largely  used  discrete  devices,  and  so  authors 
logically  argued  that  modeling  the  intrinsic  capacitances  would  be  useless  since  the 
capacitance from packaging and external circuitry would be vastly larger [81].
Furthermore,  there  was  no  direct  method  to  measure  the  data  to  verify  the  models.  With 
the  advent  of  integrated  circuits,  the  primary  capacitive  load  between  CMOS  circuit  cells 
(i.e.  an  nMOS  and  pMOS  inverter  pair)  became  dominated  by  the  intrinsic  and 
interconnect  capacitances,  rather  than  the  packaging  and  external  circuitry.  Thus, 
modeling  the  intrinsic  capacitance  (as  well  as  interconnect)  became  important. 

Integrated circuits also brought about the need for compact models to simulate large
numbers of transistors. One of the first compact models was CSIM [82] from AT&T Bell
Labs. Surprisingly, the authors of this model stayed with the simple Meyer model,
although they argued that including the intrinsic capacitances was critical, particularly Cgd.
Cgd accounts for most of the intrinsic capacitance load in CMOS circuits due to the Miller
feedback effect [82]. Berkeley's BSIM [33], built upon CSIM, also retained the Meyer
model. BSIM2, however, corrected this deficiency by including a non-reciprocal intrinsic
capacitance model. The BSIM3 [34] model moved from a strongly empirical d.c. model
to a more physically-based model, but retained the unaltered a.c. model (including
intrinsic capacitances) from BSIM2, suggesting a lag in a.c. model development.

Current  a.c.  models  are  extremely  poor.  A  great  deal  of  additional  research  is 
needed  before  a.c.  models  become  nearly  as  sophisticated  as  d.c.  MOS  current  models. 
There  are  two  reasons  the  a.c.  models  are  so  far  behind  the  d.c.  models.  First,  intrinsic 
capacitance  data  have  only  been  available  since  the  early  1980s,  over  twenty  years  after 
the  first  MOS  transistor  ID  data.  Second,  until  recently,  external  capacitances  and 
interconnect  capacitances  dominated  the  total  capacitive  load,  making  the  intrinsic 
capacitance fairly unimportant. However, as the transistor dimensions have decreased
and substantial improvements in drain current density have become difficult due to physical
limitations, major efforts, such as low-k dielectrics, have been made to reduce the
interconnect capacitances. This has increased the impact of intrinsic capacitances on
overall circuit performance and, with improved interconnect, the intrinsic capacitance could
become the predominant capacitive load in the circuit. It is interesting to note that
publications on intrinsic capacitance modeling have been increasing year-to-year since
Ward and Dutton's work [83-90].

Measurement of Intrinsic Capacitances

Because direct measurement of the intrinsic capacitance is difficult, many of
the first measurements were done with on-chip circuitry using reference capacitors [92-93]
or op-amp circuits configured as coulombmeters [94]. Eventually, external circuitry was
used, including a lock-in amplifier connected to an HP 4145 (as a voltage source) [95]
and later an off-the-rack LCR meter [96], such as the HP 4275A. In this section, the
measurement of a few of the intrinsic capacitances will be described. These can be done
using an HP 4275 or HP 4276 (same equipment with different a.c. frequency ranges), or
the newer HP 4284.

The  first  discussion  of  using  an  LCR  meter  for  the  direct  measurement  of 
intrinsic  capacitances  was  written  by  K.  C.-K.  Weng  and  P.  Yang  in  1985  [96].  In  this 
letter,  many  of  the  important  problems  with  measuring  the  intrinsic  capacitances  were 
discussed.  The  main  problem  is  that  LCR  meters  are  not  designed  to  measure  intrinsic 
capacitances.  There  are  two  sets  of  terminals  on  the  LCR  meter:  High  and  Low.  The 
high  port  applies  the  d.c.  bias  as  well  as  the  superimposed  a.c.  test  signal.  The  low  port 
measures  the  resulting  small-signal  current.  From  the  magnitude  and  phase  difference  of 
the  current  relative  to  the  applied  small-signal  test  voltage,  the  capacitance  can  be  found. 
Unfortunately,  the  low  port  is  a  virtual  a.c.  and  d.c.  ground,  so  no  d.c.  bias  may  be 
applied  to  it.  To  measure  Cgd  (dQg/dVd),  the  high  port  is  attached  to  the  drain  (to  apply 
the dVd) while the low port is attached to the gate (to measure the dQg via the
small-signal current, ig, times dt). If Cgd is desired as a function of VGS, the problem
becomes apparent: How can VGS be ramped if the gate is grounded?
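
The step from current magnitude and phase to capacitance can be sketched as below. This assumes the usual parallel conductance-capacitance interpretation of the measured admittance, Y = G + jωC; the function name and the numerical values (including the test frequency) are hypothetical and are not the settings used in this work.

```python
import cmath
import math


def capacitance_from_phasor(v_amplitude, i_amplitude, phase_lead_rad, freq_hz):
    """Parallel-model capacitance from the current phasor measured at the low port."""
    admittance = (i_amplitude / v_amplitude) * cmath.exp(1j * phase_lead_rad)
    return admittance.imag / (2.0 * math.pi * freq_hz)


# Hypothetical check: an ideal 30 fF capacitor, so the current leads by 90 degrees.
c_true, freq, v_amp = 30e-15, 1e5, 0.03
i_amp = 2.0 * math.pi * freq * c_true * v_amp
c_measured = capacitance_from_phasor(v_amp, i_amp, math.pi / 2.0, freq)
print(c_measured)
```

Any in-phase component of the current shows up as conductance rather than capacitance, which is how series resistance and leakage are separated from the capacitive term.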

The only solution, of course, is to independently bias the three terminals not
connected to the low port, as shown in Figure 4.1 for a Cgd measurement. Thus, two
additional power supplies are required, along with the internal d.c. power supply in the
LCR meter. These power supplies must be well calibrated with one another to ensure that
no potential difference exists between them when the same voltage is programmed. The
burden of negotiating the polarities of the three power supplies, once worked out, can be
easily programmed into an automated station. As an example of the polarity problem,
suppose Cgd is desired at VGS=2 V, VDS=3 V, and VXS=0 (note: the device is
active, with a current flowing from the drain to the source, unlike standard CV
measurements, where the source, drain, and substrate are tied together). The
source and substrate can then be biased at -2 V and the drain at 1 V. Since the
gate is virtual ground (VG=0), it is easy to verify that the above applied voltages give the
desired potential differences (VGS, VDS, and VXS). There is nothing particularly odd
about this configuration except that it differs from the traditional C-V measurements,
where the substrate is the ground reference instead of the gate.
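
The polarity bookkeeping in this example reduces to shifting all terminal potentials so that the low-port terminal sits at 0 V. A minimal sketch, with the hypothetical function name `supply_voltages` and terminal labels g, d, s, and x (substrate):

```python
def supply_voltages(vgs, vds, vxs, low_terminal="g"):
    """Voltages to program on each terminal, with `low_terminal` held at 0 V."""
    relative_to_source = {"g": vgs, "d": vds, "s": 0.0, "x": vxs}
    offset = relative_to_source[low_terminal]
    return {t: v - offset for t, v in relative_to_source.items()}


# The example from the text: VGS = 2 V, VDS = 3 V, VXS = 0 V, gate on the low port.
print(supply_voltages(vgs=2.0, vds=3.0, vxs=0.0))
```

The same function covers measurements with a different terminal on the low port by changing `low_terminal`.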

In the above case of Cgd, the source and substrate may be tied together to
forego one of the power supplies in Figure 4.1. If a nonzero VXS were required,
however, all terminals would have to be biased independently. Thus, if one is designing a
measurement station where any of the possible intrinsic capacitances can be measured,
three power supplies (including the internal one of the LCR meter) are necessary.

Fig. 4.1 Measurement configuration for Cgd. Requires LCR meter with internal
d.c. power supply, as well as two additional external d.c. power supplies.

Measurement Configurations

Although the standard textbook MOS device is symmetric with respect to
interchanging the source and drain, production devices may be asymmetric. This
asymmetry may be the result of implant shadowing, drain and/or source engineering, or
hot-carrier-induced degradation, among other possibilities. Implant shadowing is an
interesting case, as it may result in the gate/source and gate/drain overlap regions being
different lengths, as shown in Figure 4.2. While the resulting ID characteristics are
symmetric (that is, the ID versus VDS characteristics are the same if the source and drain
leads are swapped), the measured Cgd characteristics (as well as Cgs, Cdg, and Cds) are
asymmetric. This occurs because the measured characteristics include the constant
overlap component, as shown in the following simple equation:

$$C_{gd\_measured} = C_{ov\_drain} + C_{gd}$$

The Cov_drain term is composed of the constant overlap of the gate with the drain, as well
as an inner and outer fringe component. These fringe components have been calculated
theoretically [97], and assuming they are constant as a function of gate voltage introduces
negligible error [96]. The value of the measured Cgd in subthreshold (where Cgd_measured
= Cov_drain) has been used to estimate the length of the gate-to-drain overlap region [98],
and with the drawn channel length known, these overlap values could be used to extract
the effective channel length.
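
A rough sketch of that last idea follows. It assumes a pure parallel-plate overlap capacitor, so the inner and outer fringe components mentioned above are ignored and the overlap length is overestimated; the SiO2 permittivity, the oxide thickness, and the function names are assumptions chosen only for illustration.

```python
EPS_OX = 3.9 * 8.854e-14  # F/cm, assumed SiO2 permittivity


def overlap_length(c_ov, width_cm, tox_cm):
    """Overlap length (cm) from a parallel-plate estimate, fringe neglected."""
    return c_ov * tox_cm / (EPS_OX * width_cm)


def effective_length(l_drawn_cm, c_ov_source, c_ov_drain, width_cm, tox_cm):
    """Drawn length minus the source-side and drain-side overlap lengths."""
    return (l_drawn_cm
            - overlap_length(c_ov_source, width_cm, tox_cm)
            - overlap_length(c_ov_drain, width_cm, tox_cm))


# Hypothetical device: W = 20 um, Ldrawn = 0.40 um, tox = 60 angstroms,
# and an overlap capacitance corresponding to 0.08 um on each side.
width, l_drawn, tox = 20e-4, 0.40e-4, 60e-8
c_ov = EPS_OX * width * 0.08e-4 / tox
print(effective_length(l_drawn, c_ov, c_ov, width, tox))
```

Because the source and drain overlaps enter separately, an asymmetric device is handled by supplying the two subthreshold capacitances measured in the normal and reverse configurations.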

When necessary, the 'normal' and 'reverse' configurations of Cgd and Cgs
measurements will be specified. These are shown in Figure 4.3. Cgd^norm or Cgd_norm refers
to the 'normal' measurement mode, where the high port is applied to the drain for a Cgd
measurement. Cgd^rev or Cgd_rev refers to the 'reverse' measurement mode, where the high
port is applied to the source for a Cgd measurement. This is necessary because, for
short-channel devices, the resulting Cov value (where Cov is Cov_drain or Cov_source) can
become a significant fraction of the total effective intrinsic capacitance. Although perhaps
not obvious now, Cgs = Cgd when VDS=0. However, due to the difference in Cov, Cgs_measured
may not equal Cgd_measured. Figure 4.4 shows the Cgs and Cgd measurements in the normal
and reverse modes for a 20 × 20 µm device. Figure 4.5 shows the same measurements on
a 20 × 0.40 µm device (effective channel length is 0.24 µm). Comparing the two figures
clearly shows the negligible impact of Cov on the long-channel device Cgd and Cgs
characteristics and the large impact on the short-channel device. In both cases, the Cgd
and Cgs values are almost identical, as is the overlap-induced difference of about 3 fF.
(This 3 fF offset is not visible on the Ldrawn=20 µm device because it contributes less than
2% to the maximum capacitance, whereas the overlap contributes about 60% of the total
measured maximum capacitance for the Ldrawn=0.40 µm device.)

Fig. 4.2 Simplified schematic of asymmetric gate overlap.

Fig. 4.3 Measurement configurations for (A) Cgd in normal configuration mode;
(B) Cgs in normal configuration mode; (C) Cgd in reverse configuration
mode; and (D) Cgs in reverse configuration mode.

Later in this chapter, the results of channel hot-carrier stress on Cgd and Cgs
will  be  shown.  Because  channel  hot-carrier  stress  is  inherently  asymmetric  (since  the 
damage  occurs  near  the  drain  edge),  it  is  necessary  to  lay  down  the  above  notation  for 
later  use. 

Sample  Measurements 

For all capacitance measurements in this chapter, an HP 4284A LCR meter
was used with a small-signal test voltage of 60 mV peak-to-peak at 400 MHz. These
numbers were chosen after testing a wide range of a.c. signal voltages and frequencies
to obtain the most accurate results. The 60 mV signal may seem a little large to those
familiar with common C-V measurements, where 25 mV is typically used, but is actually
on the low end of the 23 mV to 400 mV found in most intrinsic capacitance papers
[95-96,98-106]. Frequencies below 100 MHz result in extremely poor-resolution (noisy)
intrinsic capacitance data, while frequencies above 500 MHz begin to show
marked reduction due to series resistance. LCR-specific settings on the HP 4284A were a
medium integration time with 8-cycle averaging.

Fig. 4.4 Cgd_norm, Cgd_rev, Cgs_norm, and Cgs_rev versus VGS for a 20×20 µm
MOST with VDS=0.0. Although it appears that all four curves are the
same, there are actually two sets of curves, Cgd_norm/Cgs_rev and
Cgd_rev/Cgs_norm, separated by 3 fF. Very little difference is seen because
the overlap capacitance shift is much less than the peak Cgd and
Cgs values. Compare this with Fig. 4.5.

[Figure 4.5 plot: Cgd and Cgs versus VGS/(1 V), from -0.5 to 2.5, for a 20 × 0.40 µm nMOST at VDS=0.0 V; curves labeled Cgd_rev, Cgs_norm and Cgd_norm, Cgs_rev.]

Fig. 4.5 Cgd_norm, Cgd_rev, Cgs_norm, and Cgs_rev versus VGS for a 20×0.40 µm
MOST with VDS=0.0. Roughly 3 fF parallel shift of Cgs_norm/Cgd_rev
and Cgs_rev/Cgd_norm is due to a difference in constant overlap
capacitance between the source and drain.

So  far  the  measurement  procedures  and  naming  conventions  of  intrinsic 
capacitance  have  been  discussed.  Figures  4.4  and  4.5  showed  sample  measurements  with 
VDS=0.  Although  this  is  the  typical  way  capacitances  are  measured,  the  ability  to 
measure  the  capacitance  of  active  devices,  where  VDS  >  0  when  VGS  >  VGT  (where  VGT 
is the threshold voltage at which an inversion channel forms between the source and
drain),  is  important.  Why  is  this  capability  important?  Because  in  a  real  circuit,  this  will 
commonly  occur.  If  a  correct  model  for  the  behavior  of  an  operating  transistor  is  desired, 
then  data  from  an  active  device  is  required.  Indeed,  without  this  data,  it  would  be  like 
trying  to  verify  an  IDsat  model  with  data  only  taken  in  subthreshold! 

Examples of Cgd measurements on active devices are shown in Figures 4.6
and 4.7 for 20 × 20 µm and 20 × 0.40 µm devices as a function of VGS for VDS = 0.0, 0.5,
and 1.0 V (VSX = 0.0 V). Cgd transitions from Cov_drain to a larger value once VDS < VDSsat,
that is, once the channel is no longer pinched off. From a charge perspective, this means
changes in VDS (dVd) cause changes in Qchannel, which in turn cause changes in Qg (dQg),
resulting in a larger Cgd. Thus, as VDS increases, the point at which this transition occurs
also increases, as can be seen in the figures.

Fig. 4.6 Cgd versus VGS for a 20 × 20 µm MOST with VDS=0.0, 0.5, and 1.0 V.

As  mentioned  previously,  Cgd  is  the  most  important  intrinsic  capacitance 
because,  in  a  common-source  configuration  (which  is  the  configuration  for  all  CMOS 
circuits), the effective load is 2(Cgs + Cgd(1 - Av)), where Av is the gain between the gate
input  and  drain  output  (a  large  negative  number). 
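
The load formula quoted above is easy to evaluate, and doing so shows how strongly the Miller term weights Cgd. The function names and the numerical values below are hypothetical.

```python
def effective_load(cgs, cgd, av):
    """Effective capacitive load 2 * (Cgs + Cgd * (1 - Av)) quoted in the text."""
    return 2.0 * (cgs + cgd * (1.0 - av))


def cgd_share(cgs, cgd, av):
    """Fraction of the load contributed by the Miller-multiplied Cgd term."""
    miller = cgd * (1.0 - av)
    return miller / (cgs + miller)


# Hypothetical values: equal 10 fF capacitances and a gain of -9.
print(effective_load(10e-15, 10e-15, -9.0), cgd_share(10e-15, 10e-15, -9.0))
```

With equal Cgs and Cgd, even a modest gain makes the Cgd term account for most of the load, which is why Cgd degradation matters most for circuit performance.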

The next most important capacitance, based on the above load formula, is Cgs.
Figures 4.8 and 4.9 show both Cgs and Cgd for 20 × 20 µm and 20 × 0.40 µm devices as a
function of VGS for VDS = 0.0, 0.5, and 1.0 V. Unlike Cgd, Cgs will have a finite value as
long as VGS is greater than VGT, since the channel will always be connected to the source.
At VDS=0, Cgs=Cgd since the channel charge is equally controlled by the source and
drain. However, if VGS > VGT (channel forms) and VDS > VDSsat (drain pinched off),
then the source terminal will actually control more than half of the channel charge,
resulting in a rise in Cgs above the value at VDS=0. However, once VGS increases to a
point that VDS < VDSsat, the drain is no longer pinched off, and the Cgs value begins to
decline with increasing VGS as Cgd increases rapidly. This is clearly demonstrated in
Figure 4.8 (and to a lesser extent in Figure 4.9), where the decline in Cgs corresponds to
the increase in Cgd. The model for Cgd and Cgs will be discussed later. Recalling the
discussion about the overlap-capacitance shifting in the previous section, the capacitances
shown in Figures 4.8 and 4.9 are actually Cgd_norm and Cgs_rev in order to offset the effects
of the overlap capacitance. (Fig. 4.5 shows why this was necessary.)

Fig. 4.8 Cgd and Cgs versus VGS for a 20 × 20 µm MOST with VDS=0.0, 0.5, and
1.0 V.

Channel Hot-Carrier Stress Effects on Cgd and Cgs

Because the intrinsic capacitances are somewhat difficult to measure, and
because intrinsic capacitance made a relatively small contribution to circuit performance in
past generations, very little work has been done to investigate the impact of hot-carrier
stress on intrinsic capacitance. Although the first report of hot-carrier degradation on ID was
published in 1975 by Abbas and Dockerty [1], the first investigation of Cgd and Cgs
degradation was not published until 1988 by Yao, Peckerar, Friedman, and Hughes [107].
Since then there have been several papers [102-106] by two research groups showing Cgd
and Cgs degradation for various stress conditions. Only one paper, by Dai, Walstra, and
Lee [108], showed the impact of Cgd and Cgs degradation on circuit performance. This
section will present those data, a model for the degradation [109], and additional
supplementary information not released in that short paper.

Transistors from a 0.35 µm CMOS technology for 2.5 V operation were used;
these are the same devices shown throughout this chapter. Drawn channel lengths were
0.40 µm and 0.48 µm, with effective channel lengths of 0.24 µm and 0.32 µm, respectively.
Accelerated stress was performed using the following procedure:

1)  Take  unstressed  ('fresh')  ID  versus  VDS  data  from  0  to  2.5  V  at  VGS=2.5,  2.0,  1.5, 
and  1.2  V. 

2)  Take  'fresh'  Cgs  (normal  mode)  for  reference. 

3) Take Cgd (normal mode) versus VGS from 0 to 2.5 V at VDS=0.0, 0.5, and 1.0 V.

4) Without re-probing, stress for exponentially longer times (see next paragraph for
stress conditions), followed by capacitance measurements as in (3).

5) After the final stressed Cgd measurement, take a Cgs measurement (normal mode)
and then measure the final ('stressed') ID versus VDS as in (1).

The accelerated stress conditions were VGS = 1 V, VDS = 4 V for nMOS
devices and VGS = -1 V, VDS = -4 V for pMOS devices, for a total stress time of 14.6 hr
(twelve Cgd/Cgs measurements total). Hot-carrier stressing is a complicated and
much-debated topic. Although it is possible that some degradation mechanisms may occur at
these high-voltage stress conditions which could never occur during normal operation, it
is believed that this accelerated degradation of the Cgd data will still be indicative of what
will occur over long-time operation at normal operating voltages. Forward-biasing the
source to increase the drain current without greatly changing the drain-field profile could
be used [110], but was not considered at the time the measurements were made and is not
yet accepted practice at Intel Corporation, where these measurements were taken.
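
One way to lay out exponentially longer stress times that add up to the quoted total is a geometric schedule. The sketch below is an assumption: the text gives only the 14.6 hr total and the number of measurements, so the doubling ratio and the choice of eleven stress intervals (one after the fresh measurement for each later measurement) are hypothetical.

```python
def stress_intervals(total_hours, n_intervals, ratio=2.0):
    """Geometric stress intervals (hours) that sum to total_hours."""
    first = total_hours * (ratio - 1.0) / (ratio ** n_intervals - 1.0)
    return [first * ratio ** k for k in range(n_intervals)]


intervals = stress_intervals(total_hours=14.6, n_intervals=11)
for k, hours in enumerate(intervals, start=1):
    print(f"interval {k:2d}: {hours * 60.0:8.2f} min")
```

Spacing the measurements this way gives roughly uniform coverage on a logarithmic time axis, which is how degradation data are usually plotted.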

In  steps  (1)  and  (5),  the  ID  data  were  measured  on  a  different  apparatus  (an 
automated  prober).  This  is  acceptable  since  there  will  be  negligible  measurement  error 
from  re-probing  and  measuring  the  ID  curves.  However,  in  steps  (3)  and  (4),  it  is  very 
important  that  the  stress  be  performed  without  reprobing  in  a  shielded  probe  box, 
preferably  with  the  capacitance-measuring  probes  allowing  the  force  and  sense  lines  from 
the  LCR  meter  high  and  low  ports  to  connect  right  at  the  probe  tip  (i.e.  at  the  transistor 
pads).  After  calibrating  (zeroing)  the  LCR  meter  to  account  for  the  probe  configuration 
capacitance, any additional reprobing can easily add several femtofarads to the measured
capacitance, which is on the order of the degradation amounts (shown later). Although it
is possible to integrate the IV measurements into the circuit, it is advisable to add as little
additional circuitry as possible due to the exceptionally low capacitance values being
measured.

Figure 4.1 shows the equipment set-up to measure the Cgd during stress. The
previous section discussed the a.c. signal and frequency settings used for the
measurements. Although Cgs_rev could be monitored, the actual time to measure the
intrinsic capacitance curves is about five minutes, which would add an extra hour over
the whole stress time. Furthermore, Cgs_rev, although interesting, is not a component which
comes into play in actual circuit operation. Instead, the much more important intrinsic
capacitance, Cgs_norm, is measured before and after the stress, but not in situ because that
would require either a manual reprobing, which is prohibitively long, or a switching
matrix, which could not be zeroed out properly with the LCR meter. An
interesting idea for a new piece of equipment would be an LCR meter which allows
several short and shunt zeros to be stored in the LCR meter's memory. This way the
equipment could be zeroed through different configurations of the circuit and the
software could then tell the LCR which particular zero 'set' to use before switching the
circuit over.

Using  the  methodology  outlined  above,  nMOS  and  pMOS  devices  were 
stressed  for  exponentially  increasing  time  spans  between  Cgd  measurements.  The  total 
stress  time  was  14.6  hr,  which  when  added  to  the  measurement  time  of  the  in  situ  C„d 
measurements,  is  the  length  of  time  between  the  end  of  the  work  day  and  the  beginning  of 
the  next. 
Figure 4.10 shows the results of this hot-carrier stress on Cgd at VDS = 0.0 V
for a 20 x 0.40 µm device at each time interval. Figure 4.11 shows the same
situation for a 20 x 0.48 µm device. In both cases, it is clear that for VGS > 0.4 V, Cgd
decreases with increasing stress time. The longer-channel (0.48 µm drawn) device
exhibits less degradation simply because the drain current during stress is smaller
due to the longer channel length (since, to first order, the drain current is proportional
to 1/L). Because the degradation is due to interface trap generation, as discussed in the
next section, the smaller current results in a smaller fluence and thus fewer generated
holes, resulting in fewer interface traps. Also notable in Figures 4.10 and 4.11 is an
increase in Cgd for VGS < 0.3 V.
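
The first-order 1/L argument can be put into numbers with a minimal sketch. It assumes the drawn channel lengths quoted above can stand in for the electrical lengths, which is a simplification made here for illustration only; the function name is hypothetical.

```python
# First-order estimate only: I_D is taken as proportional to 1/L.
# Assumption: the drawn lengths (0.40 um and 0.48 um) are used directly,
# although the effective channel lengths are shorter than the drawn ones.
L_SHORT_UM = 0.40
L_LONG_UM = 0.48


def relative_drain_current(l_ref_um: float, l_um: float) -> float:
    """Drain current at length l_um relative to that at l_ref_um (I_D ~ 1/L)."""
    return l_ref_um / l_um


ratio = relative_drain_current(L_SHORT_UM, L_LONG_UM)
print(f"I_D(0.48 um) / I_D(0.40 um) ~ {ratio:.3f}")
```

Under this assumption the longer device carries roughly five-sixths of the stress current, and hence accumulates a correspondingly smaller fluence over the same stress time.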

Intrinsic  Capacitance  Degradation  Model 

Both the reduction in Cgd for VGS > 0.4 V and the increase for VGS < 0.2 V
can be explained by a simple model [109]. The gate-to-drain capacitance is given by
the following integral:

$$C_{gd} = \frac{W C_O}{v_{ac}} \int_0^L v_{ac}(x)\,dx, \tag{4.1}$$

where vac is the applied a.c. test signal. In the absence of a non-uniform charge density in
the gate or at the gate interface (i.e., with spatially constant QOT and QIT), the applied
test signal used to measure Cgd should fall uniformly across the channel, as shown in the
straight "unstressed" curve of Figure 4.12. Put simply, the applied signal controls all the
charge at the drain edge and progressively less as the signal drops across the channel. By
Gauss's law, the charge in the channel must be balanced out by charge on the gate,
with the amount of charge at any location x being vac(x)·CO·W·dx. This gives the
convenient asymptotic expression Cgd = ½(WL)CO for Cgd in strong inversion (with no
trapped charge).

Fig. 4.10 Cgd versus VGS for a 20 x 0.40 µm MOST with VDS=0.0 V after hot-carrier
stress at VGS=1.0 V and VDS=4.0 V. The different curves represent measurements
taken during the stress and demonstrate a reduction in Cgd as a function of stress
time. Last curve is 52440 seconds, or 14.6 hr of stress time.

Fig. 4.11 Cgd versus VGS for a 20 x 0.48 µm nMOST with VDS=0.0 V after hot-carrier
stress at VGS=1.0 V and VDS=4.0 V. The different curves represent measurements
taken during the stress and demonstrate a reduction in Cgd as a function of stress
time. Last curve is 52440 seconds, or 14.6 hr of stress time. Less degradation is seen
for this longer-channel device compared to Fig. 4.10 simply because the current
is lower; hence the fluence is lower for the same amount of time,
resulting in less interface damage.

Fig. 4.12 Idealized picture of a MOS transistor with trapped positive charge and
interface traps near the drain edge. Diagram below shows the drop of
the a.c. test signal across the channel, and how it is affected by the
trapped charge near the drain for weak-inversion and inversion regions.
The area under the curve is proportional to Cgd, as shown in Eq. 4.1.

During stress, interface traps are generated by hot-hole dehydrogenation of
interface Si-H bonds near the MOS drain edge, as discussed by Sah [2]. When the
channel  is  strongly  inverted,  and  all  these  interface  traps  are  filled,  the  local  threshold 
voltage  near  the  drain  will  always  be  higher  than  the  rest  of  the  channel  due  to  the  filled 
interface  traps  offsetting  the  applied  gate  voltage.  Thus,  the  conductance  near  the  drain 
edge  is  lower  for  stressed  devices.  This  means  the  vac(x)  in  (4.1)  will  drop  more  rapidly 
in  the  damaged  drain  region  than  in  the  rest  of  the  channel,  resulting  in  less  total  charge 
controlled  by  the  drain  and  a  lower  Cgd  value.  This  is  clearly  depicted  in  the  "post-stress, 
inversion"  curve  of  Figure  4.12,  where  the  area  under  the  curve  (the  value  of  the  integral) 
is  easily  seen  to  be  less  than  the  unstressed  case. 
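
The area argument can be checked numerically with a minimal sketch of Eq. 4.1. Everything beyond the integral itself is an assumption made for illustration: the channel is treated as a resistive line, x is measured from the drain edge, the damaged region is given a resistance per unit length three times that of the rest of the channel, and the oxide capacitance per unit area (5 fF/µm²) and the damaged length (10% of L) are hypothetical values, not data from this work.

```python
import numpy as np


def vac_profile(x, length, damaged_len=0.0, resistance_factor=1.0, v_ac=1.0):
    """a.c. signal along the channel, with x measured from the drain edge.

    Assumed picture: a resistive line whose resistance per unit length is
    `resistance_factor` inside the damaged region [0, damaged_len] and 1
    elsewhere (arbitrary units). The signal equals v_ac at the drain (x = 0)
    and falls to zero at the source (x = length).
    """
    r_cum = np.where(
        x <= damaged_len,
        resistance_factor * x,
        resistance_factor * damaged_len + (x - damaged_len),
    )
    r_total = resistance_factor * damaged_len + (length - damaged_len)
    return v_ac * (1.0 - r_cum / r_total)


def gate_drain_capacitance(width, length, c_o, damaged_len=0.0,
                           resistance_factor=1.0, n_points=4001):
    """Evaluate Eq. 4.1 by trapezoidal integration of vac(x)."""
    v_ac = 1.0
    x = np.linspace(0.0, length, n_points)
    v = vac_profile(x, length, damaged_len, resistance_factor, v_ac)
    integral = float(np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(x)))
    return width * c_o / v_ac * integral


W_UM, L_UM = 20.0, 0.40   # drawn dimensions of the device in Fig. 4.10
C_O = 5.0                 # fF/um^2, hypothetical oxide capacitance per area

cgd_fresh = gate_drain_capacitance(W_UM, L_UM, C_O)
cgd_stressed = gate_drain_capacitance(
    W_UM, L_UM, C_O, damaged_len=0.1 * L_UM, resistance_factor=3.0
)
print(f"unstressed Cgd ~ {cgd_fresh:.2f} fF (0.5*W*L*C_O = {0.5 * W_UM * L_UM * C_O:.2f} fF)")
print(f"stressed   Cgd ~ {cgd_stressed:.2f} fF")
```

With a uniform channel the sketch reproduces the ½(WL)CO limit; making the drain-side region more resistive steepens the initial drop of vac(x) and shrinks the integral, which is the qualitative behavior described above. The magnitude of the reduction depends entirely on the assumed damage parameters.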

The only remaining question is the slight increase in Cgd at the lower gate
voltages. At these voltages, most of the capacitance is due to overlap. During stress, in
addition to the interface trap generation, there may be some hole trapping in the oxide
(from hot holes, generated in the depletion layer, injected over the SiO2 barrier into the
oxide). This adds a small amount of positive charge near the drain, which increases the
conductivity near the drain, and hence increases the overall Cgd value, as shown
pictorially in Figure 4.12 as "post-stress, weak inversion." Once VGS increases into
inversion, however, the negative interface charges compensate and then exceed the small
effect of trapped positive charge. In the case of the Cgd increase due to trapped holes, it is
likely that a weak channel forms near the drain, but does not extend to the source. This
channel, which responds to changes in the a.c. test signal, results in the small Cgd increase
with stress.

As mentioned previously, Cgs is the second most important capacitance in
terms of circuit performance after Cgd. Figure 4.13 shows the initial and final Cgs values
for a 20 x 0.40 µm device. Cgs increases with stress, for precisely the same reason as Cgd
decreases. That is, the localized negative charge caused by the interface traps results in
decreased conductivity near the drain edge, which causes a larger overall value of vac
across the channel. In the charge-control sense, more of the channel charge is controlled
by the source (and, consequently, less is controlled by the drain, as has been already
seen). The formula is identical to (4.1), with the limits swapped (of course, vac is now
maximum at the source end and zero at the drain). Because of Miller multiplication,
the effect on the total capacitive load due to the Cgs increase after stress is much less than
that of the Cgd decrease, so the overall load decreases due to stress.
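
The Miller argument can be illustrated with a minimal sketch. It assumes the common first-order approximation that the load presented by a switching stage is Cgs plus a Miller factor times Cgd, with a factor of 2 for an inverter whose output swings opposite to its input. The capacitance values are hypothetical, chosen so that the Cgs increase equals the Cgd decrease, as in the charge-control picture where source and drain share a fixed channel charge.

```python
def effective_input_load(c_gs: float, c_gd: float, miller_factor: float = 2.0) -> float:
    """First-order input load of a switching stage: Cgs + M * Cgd.

    The Miller factor M = 2 is an assumed textbook approximation, not a
    value taken from the simulations in this chapter.
    """
    return c_gs + miller_factor * c_gd


# Hypothetical values in fF: stress moves 3 fF of control from drain to source.
CGS_FRESH, CGD_FRESH = 20.0, 20.0
CGS_STRESSED, CGD_STRESSED = 23.0, 17.0

load_fresh = effective_input_load(CGS_FRESH, CGD_FRESH)
load_stressed = effective_input_load(CGS_STRESSED, CGD_STRESSED)
print(f"fresh load    ~ {load_fresh:.1f} fF")
print(f"stressed load ~ {load_stressed:.1f} fF")
```

For any Miller factor greater than one, an equal and opposite shift between Cgs and Cgd lowers the weighted sum, which is why the net load drops even though Cgs rises.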

Figure 4.14 shows the results of stress on a pMOS device. As would be
expected, there is considerably less degradation due to the decreased current caused by
the lower hole mobility. This lower mobility essentially results in fewer hot holes, and
consequently, less interface trap generation. It would appear that after some initial weak
bond breaking, very little additional degradation occurs. This is also seen in pMOS drain
current degradation, which is always less than the nMOS equivalent. Due to the results
shown in Figure 4.14, the effect of degradation on p-channel devices will be ignored.

Fig. 4.13 Cgs versus VGS at VDS=0.0, 0.5, and 1.0 V before and after CHE stress
on a 20 x 0.40 µm nMOST stressed at VGS=-1.0 V and VDS=-4.0 V
for 14.6 hr.

Fig. 4.14 Cgd degradation for 0.40 µm pMOST after stress at VGS=-1.0 V and
VDS=-4.0 V for 3.0 hr. Negligible change after the first 10 s of stress.

Degraded Circuit Simulation

The degradation of Cgd has been demonstrated before [102-107]. This
information on its own is interesting, but it is more important to assess the impact of this
degradation on circuit performance. Otherwise, all that has been done is measurement
without analysis. In this section, the effect of intrinsic capacitance degradation on circuit
performance will be discussed by simulating degraded transistors in a ring oscillator. This
will be compared to the standard methodology of only modeling the IV degradation without
accounting for the intrinsic capacitance degradation.

Before  circuits  can  be  simulated,  individual  transistors  must  be  simulated. 
This  is  done  by  fitting  experimental  data  to  a  device  model.  For  the  work  in  this 
dissertation,  the  BSIM3  model  [34]  was  used,  with  some  slight  extensions  made  to  the 
a.c. model to improve convergence. The BSIM3 model (version 3.3) is the currently
accepted SEMATECH industry standard.

The  initial  and  final  ID-VDS  data,  shown  in  Figure  4.15,  were  fit  to  the  BSIM3 
d.c.  model  using  appropriate  parameters.  The  resulting  parameters  were  saved  in  a 
parameter  set;  these  parameter  sets  are  later  used  when  simulating  the  circuit.  Thus,  the 
initial  and  final  IV  curves  were  individually  fit  and  the  resulting  parameter  sets  were 
saved.  These  will  be  referred  to  as  the  'fresh'  and  'stressed'  IV  sets,  respectively. 

The a.c. model used in BSIM3 is much less sophisticated than the d.c. model,
and there is no automated extraction methodology for it. Thus, fitting Cgd and Cgs had to
be done manually. The data fit were the VDS=0.0 data in Figures 4.10 and 4.13, again
resulting in 'fresh' and 'stressed' CV sets for the t=0 (unstressed) data and the t=14.6 hr
(stressed) data, respectively. This allows us to compare the following three scenarios:

(1)  Circuit  performance  using  fresh  IV  and  fresh  CV  parameter  sets.  This  will  give 
the  performance  of  an  unstressed  circuit. 

(2)  Circuit  performance  using  stressed  IV  and  fresh  CV  parameter  sets,  as  done 
currently  in  industry.  This  will  give  the  performance  of  a  stressed  circuit  without 
including  intrinsic  capacitance  changes. 

(3) Circuit performance using stressed IV and stressed CV parameter sets, which,
compared to (2), will show us the effect of including the Cgd degradation (and Cgs
enhancement).

To compare these, a simple ring oscillator circuit will be used, which comprises
some odd number of CMOS inverters chained together. Figure 4.16 shows an example
of a 31-stage ring oscillator, along with the individual CMOS inverter pair circuit.
Consider a voltage Vcc applied to the nMOS and pMOS gates: this will cause the nMOS
device to turn on and the pMOS device to turn off, which results in ground (0 V)
appearing at the drain lead of the nMOS device. Consider a voltage of ground (0 V)
applied to the nMOS and pMOS gates: this will cause the pMOS device to turn on and
the nMOS device to turn off, which results in Vcc appearing at the drain lead of the
pMOS device. Thus, the CMOS circuit is an inverter since the voltage applied to the gate
is inverted (in the logic sense of the term) at the output. Any odd number of inverters
connected in a ring will result in the voltage oscillating from high to low around the chain.
This oscillation frequency is often used to demonstrate the impact of ID degradation on
circuit performance [43], because as ID drops due to degradation, it takes longer for a
switching transistor to charge up the intrinsic capacitance of the next inverter (as well as
the interconnect), which slows down the overall ring oscillation frequency.

Fig. 4.15 Initial and final ID versus VDS characteristics for a 20 x 0.40 µm MOST.
Stress condition was VGS=1.0 V and VDS=4.0 V for 14.6 hr. From these
data come the 'fresh' (initial) and 'stressed' (final) parameter sets.

Fig. 4.16 Ring oscillator. (A) CMOS inverter as pMOST and nMOST circuit,
abbreviated by the logic symbol; (B) 31-stage ring oscillator with fanout
of three, composed of CMOS inverters and ideal (no capacitance or
resistance) interconnect.
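
The dependence of the ring frequency on drive current and load can be sketched with the standard first-order relations f = 1/(2·N·td) and td ≈ C·Vcc/(2·I). This is a hand estimate, not the circuit simulation used in this work, and every numerical value below (supply voltage, drive currents, load capacitances) is hypothetical.

```python
def stage_delay(c_load: float, v_cc: float, i_drive: float) -> float:
    """First-order inverter delay: time to move the load through Vcc/2."""
    return c_load * v_cc / (2.0 * i_drive)


def ring_frequency(n_stages: int, c_load: float, v_cc: float, i_drive: float) -> float:
    """Oscillation frequency of an N-stage ring, f = 1 / (2 * N * t_d)."""
    if n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverters")
    return 1.0 / (2.0 * n_stages * stage_delay(c_load, v_cc, i_drive))


N_STAGES = 31
V_CC = 2.5                              # V, hypothetical supply
I_FRESH, I_STRESSED = 5.0e-3, 4.7e-3    # A, hypothetical drive currents
C_FRESH, C_STRESSED = 180e-15, 171e-15  # F, hypothetical fanout-of-3 loads

f_fresh = ring_frequency(N_STAGES, C_FRESH, V_CC, I_FRESH)
f_iv_only = ring_frequency(N_STAGES, C_FRESH, V_CC, I_STRESSED)
f_iv_and_cv = ring_frequency(N_STAGES, C_STRESSED, V_CC, I_STRESSED)

for label, f in [("fresh IV, fresh CV", f_fresh),
                 ("stressed IV, fresh CV", f_iv_only),
                 ("stressed IV, stressed CV", f_iv_and_cv)]:
    print(f"{label:26s} {f / 1e6:8.1f} MHz")
```

Because the frequency scales as I/C in this estimate, a reduced drive current lowers it while a reduced load raises it, so a simultaneous drop in both partially cancels.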

Using the three parameter sets discussed above, which were derived from the
data already presented, three sets of ring oscillator waveforms, as shown in Figure 4.17,
can be found. These represent the output node voltage of any given CMOS inverter in a
31-stage ring with a fanout of 3 (each output node connected to three CMOS inverter
inputs), and are simulated using Intel's SPICE-like circuit simulator. The fanout is used
to simulate a typical circuit, where one transistor drives multiple downstream transistors.
These additional transistors obviously add to the load. Interconnect capacitance is
neglected since its impact is layout-dependent.

The resulting oscillation frequencies for the Ldrawn=0.40 µm ring were as
follows for each parameter set:

    Parameter set               Oscillation frequency
    Fresh IV, Fresh CV          85.6 MHz
    Stressed IV, Fresh CV       80.2 MHz
    Stressed IV, Stressed CV    82.1 MHz

The above data clearly demonstrate that by including the stressed intrinsic capacitance,
some of the ID degradation is offset, resulting in a higher post-stress operating frequency.
This is simply because the capacitive load, which the ID must drive, is degrading
simultaneously with ID. Note that the ID degradation in one device is being offset by the
Cgd degradation in other devices in the next stage.

The difference between the above examples may look quite small (80.2 MHz
for the normal IV-only degradation set versus 82.1 MHz for our IV and CV degradation
set). To put it into perspective, consider the following example. Suppose one is
designing an 85 MHz processor with a critical 31-transistor path which limits the
maximum frequency. Furthermore, assume that the accelerated stress at 14.6 hr
represents exactly 10 years of normal operation, which is the specification required for
the 85 MHz processor. Finally, assume that this processor is required to remain within
5% of 85 MHz during its 10-year life (80.75 MHz to 89.25 MHz). Figure 4.18 shows
pictorially what can be deduced from the table above, namely that the "Stressed IV, Fresh
CV" set (industry normal methodology) will result in a predicted failure, since the
resulting simulated frequency of 80.2 MHz is less than the 80.75 MHz guardband.
However, when the "Stressed IV, Stressed CV" set is used, the resulting 82.1 MHz simulated
frequency is well within the guardband range, preventing unnecessary redesign and/or
scrap (actually, the processor would probably be sold as a slower version at lower
margin).
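
The guardband comparison amounts to simple arithmetic on the three simulated frequencies, which the following sketch reproduces. The frequencies, the 85 MHz target, and the 5% tolerance come from the text; the function and variable names are illustrative.

```python
NOMINAL_MHZ = 85.0
TOLERANCE = 0.05  # +/- 5% allowed deviation over the 10-year life

simulated_mhz = {
    "Fresh IV, Fresh CV": 85.6,
    "Stressed IV, Fresh CV": 80.2,
    "Stressed IV, Stressed CV": 82.1,
}


def within_guardband(f_mhz: float, nominal: float = NOMINAL_MHZ,
                     tol: float = TOLERANCE) -> bool:
    """True if the frequency lies within +/- tol of the nominal target."""
    return nominal * (1.0 - tol) <= f_mhz <= nominal * (1.0 + tol)


def percent_change(f_mhz: float, reference_mhz: float) -> float:
    """Signed change of f_mhz relative to reference_mhz, in percent."""
    return 100.0 * (f_mhz - reference_mhz) / reference_mhz


fresh = simulated_mhz["Fresh IV, Fresh CV"]
for label, f in simulated_mhz.items():
    verdict = "pass" if within_guardband(f) else "FAIL"
    print(f"{label:26s} {f:5.1f} MHz  {percent_change(f, fresh):+6.2f}%  {verdict}")
```

Relative to the fresh ring, the IV-only prediction is about 6.3% slower while the IV-plus-CV prediction is about 4.1% slower, which is what moves the result from just outside to inside the 80.75 MHz lower limit.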

Conclusion 

This chapter examined the effect of channel hot-carrier stress on the two main
intrinsic capacitances in a common-source MOST CMOS circuit: Cgd and Cgs. From
measurement of these curves before and after stress, along with the ID characteristics,
fresh and stressed CMOS inverters were simulated, and the effect of stress on a CMOS-based
ring oscillator was demonstrated. It was clearly shown that the inclusion of Cgd
degradation offsets the well-known ID degradation by reducing the capacitive load the
drain current must drive. It is important to note that the interconnect capacitance was
ignored in this case, although it is quite large in reality. As transistors scale smaller,
however, it is predicted by the SIA roadmap [111] that low-k dielectrics and lower
resistivity interconnects will be used in an effort to reduce the RC delay from
interconnect. As the interconnect capacitance is reduced, the intrinsic capacitance
becomes more significant. The exact importance of one over the other is a function of
layout, and cannot be easily assessed. However, it is clear that the importance of intrinsic
capacitances on circuit performance will only increase as efforts are made to reduce all
extrinsic capacitance factors.

Fig. 4.17 Beginning few cycles of a 31-stage ring oscillator circuit using three
different parameter sets. The highest frequency curve (unstressed IV,
unstressed CV) expectedly comes from the simulation using the two
unstressed IV and CV parameter sets. The slowest frequency curve
(stressed IV, unstressed CV) comes from the simulation using only the
stressed IV, while the middle frequency curve (stressed IV, stressed
CV) comes from using both the stressed IV and CV parameter sets.
This demonstrates that inclusion of Cgd degradation results in circuit
performance improvement due to offsetting some of the ID degradation.

Fig. 4.18 Example demonstrating how including intrinsic capacitance can result
in substantial benefits. Here it is assumed that the accelerated stress of
14.6 hr is equivalent to 10 yr of operation for a fictional 85 MHz
processor with a ±5% allowed frequency deviation. By including Cgd
degradation, the processor performs within specification, while by not
including the intrinsic capacitance degradation (as is normally done),
the processor is estimated to fail, possibly resulting in unnecessary
redesign/scrap.


CHAPTER  5 
SUMMARY  AND  CONCLUSION 

As  MOS  transistors  scale  smaller,  previously  unimportant  or  avoidable 
problems  such  as  channel-length  modulation,  polysilicon  gate  depletion,  and  intrinsic 
capacitance  degradation  become  significant.  In  the  previous  chapters,  each  of  these 
scaling-related  issues  was  discussed  and  the  ramifications  of  the  problems  were 
demonstrated. In the first two cases, the problem was accounted for by extending
previous theory to accommodate it. In the last case, the previous degraded-circuit
simulation methodology was extended to show the unexpected benefits of including
intrinsic capacitance degradation in circuit simulations.

The history and derivations of the prominent long-channel current models were
introduced so that the pros and cons of each could be discussed. The Pao-Sah current model, by
explicitly taking drift and diffusion into account, was shown to be the most accurate
long-channel model, while the simpler charge-sheet model was shown to be nearly as good.
Because these long-channel models do not take the drain encroachment into account, they
need to be extended to be useful in today's short-channel regime. Although the depletion
region near the drain is 2-dimensional, the 1-D Pao-Sah model was extended to include
the channel-shortening effect by dividing the MOS channel into two sections: an ideal
1-D long-channel portion and a 1-D drain depletion region to account for the channel-length
modulation effect. The ideal long-channel portion used the Pao-Sah current model to
calculate the current, and three methods were proposed and implemented to find the
boundary  potential  between  the  two  sections.  The  most  complicated  method  of  matching 
the  longitudinal  fields  was  shown  to  be  the  only  one  capable  of  demonstrating  smooth 
transitions  in  both  the  drain  current  and  drain  conductance  at  the  point  where  the  drain 
voltage  exceeds  VDSsat  and  the  drain  space-charge  layer  thickens.  The  other  two 
methods,  'saturation  voltage'  and  'surface  potential  self-saturation,'  both  showed  the 
expected  channel-length-modulation-induced  saturation  current  increase  as  the  channel 
length  (for  a  square  device)  decreases,  but  the  transition  point  near  VDSsat  in  the  ID  versus 
VDS  plot  was  abrupt  enough  to  cause  discontinuities  in  the  drain  conductance. 

The  effect  of  polysilicon  gate  depletion  on  the  MOS  LFCV  characteristics  was 
demonstrated using a Fermi-Dirac-based model. It was shown that, as the oxide thickness
decreases, the effect of polysilicon depletion becomes increasingly pronounced. The
purpose  of  thinning  the  gate  oxide  is  to  increase  the  carrier  concentration  for  a  given 
applied  voltage,  but  polydepletion  offsets  an  increasingly  large  portion  of  this  gain,  as 
does  the  Fermi-Dirac  distribution  (compared  to  the  Boltzmann  distribution).  With  this 
polysilicon-gate  LFCV  model,  it  was  shown  that  the  oxide  thickness,  flatband  voltage, 
and  gate  and  substrate  doping  concentrations  could  be  extracted  from  experimental 
capacitance data. Two extraction methods, the 3-point and 3-region, were developed and
were shown to work well with 130 Å (2.7% RMS fit) and sub-30 Å (10% RMS fit) data.
Quantum  effects  were  neglected  because  it  is  believed  that  thermal  broadening,  surface 
roughness,  and  non-random  dopant  distributions  will  all  cause  the  localized  states  to 
broaden  into  a  continuum.  Details  about  the  poly  LFCV  model  were  also  discussed. 

Intrinsic capacitance was predicted to become an increasingly large part of the
capacitive load as the extrinsic capacitances, predominantly interconnect, are reduced in
order to improve circuit performance. Measurements of the two most important
intrinsic capacitances, Cgd and Cgs, were performed on state-of-the-art 0.24 µm
effective-channel-length nMOS and pMOS devices. Voltage-accelerated stress of the nMOS
devices via channel-hot electrons showed that Cgd decreases and Cgs increases with stress
time, whereas the pMOS devices saw negligible change. Because of Miller feedback,
however, the nMOS Cgd reduction dominates the Cgs increase, resulting in an overall
CMOS load reduction. This load reduction offsets the drop in the drain current, both of
which are caused by the same degradation mechanism: interface traps. A model was given to
qualitatively explain the Cgd reduction (and Cgs increase) with stress. The pre-stress and
post-stress ID, Cgd, and Cgs data were fit using the BSIM3 device model so that a stressed
circuit could be simulated. A simulation of a 31-stage ring oscillator verified that the
decrease in the capacitive load from Cgd reduction partially offset the ID reduction,
resulting in improved simulated performance compared to simulations only taking the ID
reduction into account (which is the standard industry practice). Although these
simulations ignored interconnect capacitance, it is clear that the current ID-only method is
conservative and, by taking Cgd degradation into account, the design guardbands can be
loosened, resulting in less costly redesign and scrap.

This  dissertation  has  examined  several  of  the  current  and  future  problems 
associated  with  scaling  transistor  characteristics  and  presented  extended  models  and  new 
methodology  to  account  for  their  effects.  It  is  clear  that,  for  the  short  term,  1-D  models 
can be used to model the most important short-channel/thin-oxide deviations from simple
theory. There is no reason why 1-D models cannot be used, with appropriate extensions
and  partitions,  until  transistors  are  replaced  with  a  completely  new  technology,  which  is 
unlikely  to  happen  in  the  next  twenty  years.  Even  as  the  1-D  models  become  less 
accurate,  they  will  always  retain  importance  for  providing  initial  guesses  for  more 
complete  2-  and  3-dimensional  models,  since  the  1-D  model  will  always  embody  a 
majority  of  the  first-order  device  effects. 


APPENDIX 
METAL-GATE  LFCV  MODEL  DERIVATION 

This appendix contains a complete derivation of the low-frequency,
degenerate, deionized, Fermi-Dirac-based, metal-gate CV model [46], used as the
starting point for the polysilicon-gate model in Chapter 3. Similar derivations were made
by Seiwatz and Green [44] and Hunter [45].

Basic  CV  Equations 

The low-frequency CV (LFCV) characteristics of a metal-oxide-semiconductor
(MOS) capacitor are fairly straightforward. Figure A.1(A) shows the
three layers which comprise the MOS name: the metal (gate), the insulator (SiO2, or
oxide), and the semiconductor (substrate). Chapter 3 shows the extension of this model
to polysilicon gates, the effects of the polysilicon gate on the device
characteristics, and the implications for device performance.

Ideally, the CV curve model would have one equation: the gate capacitance,
Cg, as a function of the gate voltage, VG. At the very least, it would be good to have a set of
parametric equations, with VG and Cg as functions of some other parameter (such as the
substrate surface potential, VIX). Due to the complexities of the mathematics, the latter is
the best that can be done.
Gate Potential

Two simple equations are required: one for the gate voltage, and one for the
gate capacitance. Looking again at Fig. A.1(A), Kirchhoff's voltage law requires

$$V_G = V_M + V_O + V_{IX} \tag{A.1a}$$

That is, the voltage applied at the gate must be equal to the drops across the metal, the
oxide, and the semiconductor, respectively.

Because the Fermi level of the metal does not necessarily coincide with the
Fermi level of the substrate (which is controlled by the dopant), there is an additional
term, ΦMS, the metal-to-semiconductor work-function difference, to account for this
offset. This is best visualized on a band diagram, as shown in Fig. A.1(B) for an arbitrary
positive applied voltage with a p-substrate MOS capacitor (MOSC).

From Figure A.1(B), the voltage drops across the device are clearly:

$$\Phi_M + V_O = \chi_S - V_{IX} + (E_C - E_I)/q + V_F + V_G \tag{A.1b}$$

where ΦM is the work function of the metal, χS is the electron affinity of the substrate,
EC and EI are the conduction band edge and intrinsic energies, respectively, and VF is the
Fermi voltage, which is equivalent to (EI − FP)/q. Collecting these terms in a form more
like the previous equation gives

$$V_G = V_O + V_{IX} + \Phi_{MS} \tag{A.1c}$$

where ΦMS = ΦM − ΦS = ΦM − (χS + (EC − EI)/q + VF). In Eq. A.1a a voltage drop across
the metal was included, which was purposely neglected in the band diagram and, hence,
in Eq. A.1b. Ideally, the voltage drop across the gate will be 0 V. When a metal is used,
the drop is effectively 0 V. Polysilicon gates introduce a depletion layer which causes
a voltage drop, as well as extra capacitance. Chapter 3 deals with this important effect,
while in this appendix it will simply be assumed that the metal is a perfect conductor;
thus VM = 0 V, and is neglected in Eq. A.1c.

Fig. A.1 MOS capacitor schematic and corresponding energy-band diagram.
(A) Schematic diagram of a MOS capacitor and (B) corresponding
energy-band diagram depicting the potential drops. Shown is a
positive voltage VG applied at the gate, resulting in the SiO2/Si
surface entering inversion.

The drop across the insulator, VO (where the 'O' is for oxide, since SiO2 is the
prevalent insulator for silicon devices) will be found in the next section.

The potential across the semiconductor, the surface potential VIX, cannot be
easily measured or found. This will be the unknown variable which relates (A.1c) and
the Cg formula. From a strictly mathematical point of view, VIX (or the normalized
equivalent, UIX) is the parametric variable for the two equations (VG and Cg).

Oxide  Potential 

Charge neutrality guarantees that the sum of the charges through the circuit in
Fig. A.1(A) is zero. Thus,

$$Q_G + Q_{OT} + Q_{IT} + Q_S = 0. \tag{A.2}$$

QG, the gate charge, is equal to the charge at the gate/insulator interface (by Gauss's
theorem), so QG = |εO EO| = εO VO/TOX = CO VO. Similarly, QS = −εS EIX at the
insulator/silicon interface, where EO is the electric field at the gate/oxide interface, EIX is
the electric field at the oxide/substrate interface, TOX is the oxide thickness, and CO is the
constant insulator (oxide) capacitance. QOT and QIT represent the trapped charge and
interface charge, respectively, and εO is the dielectric constant of the insulator.

Substituting the above two relationships into (A.2) and solving for the oxide
(insulator) potential gives

$$V_O = \varepsilon_S E_{IX}/C_O - (Q_{OT} + Q_{IT})/C_O. \tag{A.3}$$
With this relation, (A.1c) can be rewritten as

$$V_G = \left[\Phi_{MS} - (Q_{OT} + Q_{IT})/C_O\right] + V_{IX} + \varepsilon_S E_{IX}/C_O \tag{A.4}$$

$$V_G = V_{FB} + V_{IX} + \varepsilon_S E_{IX}/C_O, \tag{A.5}$$

where VFB, the flat-band voltage, is given by

$$V_{FB} = \Phi_{MS} - (Q_{OT} + Q_{IT})/C_O. \tag{A.6}$$

The terms of (A.5) are almost all known. εS can be found in a handbook. CO is a
constant (at a constant temperature), and, in the case of deriving this formula, can be
assumed known. VIX is the parametric variable discussed earlier, so there is only one
unknown variable: EIX, the field across the semiconductor. Before this problem is
rectified, the capacitance aspect of the CV derivation will be investigated.
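
Equations (A.5) and (A.6) translate directly into a short sketch. The function names and all numerical inputs below are hypothetical (a 130 Å oxide, an illustrative work-function difference, illustrative oxide and interface charge, and an arbitrary surface field); the surface field EIX is simply supplied as an input here, since its dependence on VIX has not yet been derived at this point.

```python
EPS_0 = 8.854e-12          # F/m, vacuum permittivity
EPS_SI = 11.7 * EPS_0      # F/m, silicon permittivity (handbook value)
EPS_OX = 3.9 * EPS_0       # F/m, SiO2 permittivity (handbook value)


def oxide_capacitance(t_ox_m: float) -> float:
    """Oxide capacitance per unit area, C_O = eps_O / T_OX, in F/m^2."""
    return EPS_OX / t_ox_m


def flatband_voltage(phi_ms: float, q_ot: float, q_it: float, c_o: float) -> float:
    """Eq. (A.6): V_FB = Phi_MS - (Q_OT + Q_IT) / C_O."""
    return phi_ms - (q_ot + q_it) / c_o


def gate_voltage(v_ix: float, e_ix: float, v_fb: float, c_o: float,
                 eps_s: float = EPS_SI) -> float:
    """Eq. (A.5): V_G = V_FB + V_IX + eps_S * E_IX / C_O."""
    return v_fb + v_ix + eps_s * e_ix / c_o


# Hypothetical inputs for illustration only.
c_o = oxide_capacitance(130e-10)                       # 130 Angstrom oxide
v_fb = flatband_voltage(phi_ms=-0.9, q_ot=5e-5, q_it=5e-5, c_o=c_o)
v_g = gate_voltage(v_ix=0.8, e_ix=1.0e7, v_fb=v_fb, c_o=c_o)
print(f"C_O  = {c_o * 1e3:.3f} mF/m^2")
print(f"V_FB = {v_fb:.3f} V")
print(f"V_G  = {v_g:.3f} V")
```

Sweeping VIX and evaluating EIX(VIX) from the semiconductor model would turn `gate_voltage` into one half of the parametric CV description, with the gate capacitance supplying the other half.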

Gate  Capacitance 

Looking again at Fig. A.1(A), the gate capacitance, $C_g$, seems almost trivial. It is simply the series combination of the metal (gate-contact) capacitance (assumed to be infinite, since a conductor has no space-charge layer width), the insulator capacitance ($C_O$), and the semiconductor capacitance ($C_{ix}$), as shown in Fig. A.1(B). The 'ix' subscript designates the band bending from the interface (i) to the substrate (x), which is an important designation when polysilicon gates are used. Hence, the gate capacitance is

$$1/C_g = 1/C_O + 1/C_{ix} \quad \text{(series capacitance summation)}$$

$$C_g = C_{ix} C_O/(C_{ix} + C_O). \tag{A.7}$$

The capacitive contribution from a polysilicon-gate space-charge layer adds another series term, and is discussed in detail in Chapter 3.

Again there is only one unknown: the semiconductor space-charge capacitance. Equations for $C_{ix}$ and $E_{IX}$ will be derived in a following section, but first formulae for the electron and hole concentrations are needed, because both are functions of the carrier concentrations.
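
The two results obtained so far, the gate-voltage relation (A.5) with (A.6) and the series-capacitance relation (A.7), are simple enough to be sketched directly. The Python below is only a minimal illustration and is not code from this work: the function names, the relative permittivity of 3.9 for the oxide, and the 130 Å oxide thickness in the usage lines are assumptions chosen to give plausible magnitudes.

```python
EPS_0 = 8.854e-14  # vacuum permittivity in F/cm


def flat_band_voltage(phi_ms, q_ot, q_it, c_o):
    """(A.6): V_FB = phi_MS - (Q_OT + Q_IT)/C_O, charges per unit area."""
    return phi_ms - (q_ot + q_it) / c_o


def gate_voltage(v_fb, v_ix, e_ix, c_o, eps_s):
    """(A.5): V_G = V_FB + V_IX + eps_S * E_IX / C_O."""
    return v_fb + v_ix + eps_s * e_ix / c_o


def gate_capacitance(c_ix, c_o):
    """(A.7): series combination of C_ix and C_O."""
    return c_ix * c_o / (c_ix + c_o)


# Illustrative numbers (assumed): 130 angstrom oxide, relative permittivity 3.9.
c_o = 3.9 * EPS_0 / 130e-8                      # oxide capacitance, F/cm^2
print(gate_capacitance(c_o, c_o) / c_o)         # equal capacitors in series
print(gate_capacitance(1e3 * c_o, c_o) / c_o)   # large C_ix: C_g approaches C_O
```

The second print line reflects the limiting behavior implied by (A.7): when the semiconductor capacitance is much larger than $C_O$, the gate capacitance is set by the oxide alone.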

Semiconductor  Carrier  Concentration  Formulae 

An easily varied carrier concentration is what differentiates a semiconductor from an insulator or a conductor. In this section, the electron and hole concentrations will be derived using Fermi-Dirac (degenerate) statistics.

To find the carrier concentration in a semiconductor as a function of energy, two things are needed: the three-dimensional density of states, $D_3(E)$, and the distribution function, $f(E)$. The relationship between energy and wave vector (E-k space) is approximately parabolic near the bottom of the conduction band. The conduction electrons will tend to be near this minimum, allowing us to use the following parabolic-band formula [43]:

$$D_3(E)\,dE = 4\pi (2m_e/h^2)^{3/2} \sqrt{E - E_C}\,dE. \tag{A.8}$$

The well-known Fermi-Dirac occupation function is given by

$$f(E) = \{1 + \exp[(E - E_F)/kT]\}^{-1}, \tag{A.9}$$

where $E_F$ is the Fermi level. Examination of (A.9) shows that $f(E = E_F) = 0.5$. Thus, the Fermi level is the energy at which a state has a 50% probability of being occupied. A formula for the Fermi level is given later [(A.37) and (A.38)].

Electron Concentration


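By integrating the product of (A.8) and (A.9) over the energy from the bottom of the conduction band ($E_C$) to free space ($E_{VL}$, the vacuum level), the number of conduction electrons can be found.

$$n = \int_{E_C}^{E_{VL}} f(E)\,D_3(E)\,dE \tag{A.10}$$

$$n = 4\pi (2m_e/h^2)^{3/2} \int_{E_C}^{E_{VL}} \frac{\sqrt{E - E_C}\,dE}{1 + \exp[(E - E_F)/kT]} \tag{A.11}$$

Substituting $\epsilon = (E - E_C)/kT$ (so that $dE = kT\,d\epsilon$) and $\eta = (E_F - E_C)/kT$, and noting that $\epsilon_{VL} \gg 1$ (which means the upper integration limit can be approximated by infinity), results in

$$n = 4\pi (2m_e kT/h^2)^{3/2} \int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{1 + \exp[\epsilon - \eta]} \tag{A.12}$$

$$n = N_C \frac{2}{\sqrt{\pi}} \int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{1 + \exp[\epsilon - \eta]} \tag{A.13}$$

$$n = N_C \mathcal{F}_{1/2}(\eta). \tag{A.14}$$

$N_C$, known as the effective density of conduction band states, is given by

$$N_C = 2(2\pi m_e kT/h^2)^{3/2} \tag{A.15a}$$

$$N_C = 2.51 \times 10^{19} (m_e/m)^{3/2} (T/300)^{3/2}\ \text{cm}^{-3}, \tag{A.15b}$$

where $m$ is the free-electron mass. $\mathcal{F}_{1/2}(\eta)$ is the Fermi-Dirac integral of order 1/2, which is shorthand for

$$\mathcal{F}_{1/2}(\eta) = \frac{2}{\sqrt{\pi}} \int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{1 + \exp[\epsilon - \eta]}. \tag{A.16}$$

Why is there a $2/\sqrt{\pi}$ term in front of the integral? This notation of the Fermi-Dirac (FD) integral, proposed by Dingle [112], is from a family of FD integrals of the form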


$$\mathcal{F}_j(\eta) = \frac{1}{\Gamma(j+1)} \int_0^\infty \frac{\epsilon^j\,d\epsilon}{1 + \exp[\epsilon - \eta]}. \tag{A.17}$$

Thus, the $2/\sqrt{\pi}$ term comes from $\Gamma(1.5)^{-1}$. The Dingle notation has a number of beneficial properties compared with the other FD notational family, called Sommerfeld notation, which differs only by the factor of $\Gamma(j+1)^{-1}$. The most useful (Dingle-notation) property, when it comes to device physics, is

$$\frac{d}{d\eta}\mathcal{F}_j(\eta) = \mathcal{F}_{j-1}(\eta), \tag{A.18}$$

which makes differentiating and integrating FD integrals quite simple, as will be seen later. When this Dingle notation is used, working with FD integrals becomes almost as easy as using exponentials (i.e., the outcome of using the Boltzmann distribution function instead of the Fermi-Dirac distribution function when deriving the carrier concentration). When Sommerfeld notation is used instead of Dingle, differentiation results in a $\Gamma(j+1)/\Gamma(j)$ multiplicative term in (A.18).
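
The Dingle-normalized integral and its derivative property can be checked numerically. The sketch below is an illustration rather than the method used in this work: the function name `fermi_dirac`, the substitution $\epsilon = t^2$, the truncation point, and the use of composite Simpson's rule are all assumptions made for the example.

```python
import math


def fermi_dirac(j, eta, n=2000):
    """Dingle-normalized Fermi-Dirac integral F_j(eta) of (A.17), for j > -1.

    The substitution eps = t**2 removes the square-root behavior at eps = 0,
    so composite Simpson's rule converges quickly. The integral is truncated
    where the integrand has decayed by roughly exp(-50). Intended for
    moderate eta (roughly -600 < eta), since exp() would overflow otherwise.
    """
    t_max = math.sqrt(max(eta, 0.0) + 50.0)
    h = t_max / n  # n must be even for Simpson's rule

    def g(t):
        return 2.0 * t ** (2.0 * j + 1.0) / (1.0 + math.exp(t * t - eta))

    total = g(0.0) + g(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / (3.0 * math.gamma(j + 1.0))


eta, d = 1.0, 1e-3
slope = (fermi_dirac(0.5, eta + d) - fermi_dirac(0.5, eta - d)) / (2.0 * d)
print(slope, fermi_dirac(-0.5, eta))               # the two agree, per (A.18)
print(fermi_dirac(0.5, -10.0) / math.exp(-10.0))   # near 1: Boltzmann limit
```

The last line illustrates the remark about exponentials: for strongly negative $\eta$, the FD integral of any order reduces to $\exp(\eta)$.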

Hole  Concentration 

The derivation for the hole concentration is identical to that of the electron concentration, except that the density-of-states equation is referenced from the valence band and uses the effective hole mass. The occupation function for holes is $\{1 - f(E)\}$, or, in other words, the holes are where the electrons are not.

$$D_3(E)\,dE = 4\pi (2m_h/h^2)^{3/2} \sqrt{E_V - E}\,dE \tag{A.19}$$

$$p = \int_{E_V'}^{E_V} \{1 - f(E)\}\,D_3(E)\,dE \tag{A.20}$$


$E_V$ and $E_V'$ are the top and bottom of the valence band, respectively. Using the same methods as above, and making the same wide-band approximation, the following equation results:

$$p = N_V \frac{2}{\sqrt{\pi}} \int_0^\infty \frac{\epsilon'^{1/2}\,d\epsilon'}{1 + \exp[\epsilon' - \eta']} \tag{A.21}$$

$$p = N_V \mathcal{F}_{1/2}(\eta'). \tag{A.22}$$

$\epsilon'$ is $(E_V - E)/kT$, and $\eta'$ is $(E_V - E_F)/kT$ $(= -E_G/kT - \eta)$. This is a different, although equivalent, presentation than others have used [typically, $p = N_V \mathcal{F}_{1/2}(-\eta - \epsilon_G)$, with $\epsilon_G = E_G/kT$], and it better shows the symmetry of the electron and hole distributions. The $N_V$ term, known as the effective density of valence band states, is given by

$$N_V = 2(2\pi m_h kT/h^2)^{3/2} \tag{A.23a}$$

$$N_V = 2.51 \times 10^{19} (m_h/m)^{3/2} (T/300)^{3/2}\ \text{cm}^{-3}. \tag{A.23b}$$

Comparing (A.23b) to (A.15b) shows that the difference between the effective density of states for electrons in the conduction band and that for holes in the valence band stems from the difference in the effective masses of electrons and holes.
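
The numerical prefactor shared by (A.15b) and (A.23b) can be reproduced from the physical constants. The following is a minimal sketch; the function name is hypothetical, and the two effective-mass ratios in the usage lines are illustrative assumptions, not values taken from this work.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K
M_0 = 9.1093837015e-31  # free-electron mass, kg


def effective_density_of_states(mass_ratio, temperature=300.0):
    """(A.15a)/(A.23a): 2 (2 pi m* k T / h^2)^(3/2), returned in cm^-3."""
    m_eff = mass_ratio * M_0
    per_m3 = 2.0 * (2.0 * math.pi * m_eff * K_B * temperature / H ** 2) ** 1.5
    return per_m3 * 1e-6  # convert m^-3 to cm^-3


print(effective_density_of_states(1.0))   # prefactor of (A.15b) and (A.23b)
# Assumed effective-mass ratios, for illustration only:
print(effective_density_of_states(1.08))  # an N_C-like value
print(effective_density_of_states(0.81))  # an N_V-like value
```

Because the two formulae differ only in the effective mass, the ratio $N_C/N_V$ is simply $(m_e/m_h)^{3/2}$ at any temperature.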


Semiconductor  Relations 


Formulae for the electron and hole concentrations were derived so that equations for the field and capacitance across the semiconducting material could be found. Both of these equations will be functions of the carrier concentrations, which, in turn, will be functions of the potential across the semiconductor, $V_{IX}$ (the surface potential).

Charge  Density 

The charge density in the semiconductor is given by the equation

$$\rho = q(-N + P - N_A + P_D - n_T). \tag{A.24}$$

The $N$ and $P$ terms are as given above [(A.14) and (A.22)], and the $N_A$ and $P_D$ terms are the ionized acceptors and donors, respectively. The $n_T$ term represents the contribution from trapped charge.

Generally, it is assumed that all of the impurities are completely ionized in doped silicon because shallow-level impurities are used. However, at low temperatures and/or high doping ($> 10^{18}$ cm$^{-3}$), incomplete impurity ionization occurs, and the approximation is no longer valid. For deep-level impurities, deionization will become significant even at moderate doping and room temperature.

For p-type impurities, the ratio of ionized acceptors ($N_A^-$, empty of holes) to neutral acceptors ($N_A^0$, filled with holes) is

$$N_A^-/N_A^0 = (1/g_A)\exp([E_F - E_A]/kT), \tag{A.25}$$

where $g_A$ is the degeneracy factor and $E_A$ is the acceptor energy level. Noting that $N_{AA} = N_A^- + N_A^0$, (A.25) can be transformed into

$$N_A = N_A^- = \frac{N_{AA}}{1 + g_A\exp([E_A - E_F]/kT)}. \tag{A.26}$$

A similar equation can be derived for n-type impurities, and is given by

$$P_D = N_D^+ = \frac{N_{DD}}{1 + g_D\exp([E_F - E_D]/kT)}. \tag{A.27}$$

These two equations take deionization into account. Generally, it is assumed that impurities are completely ionized, which, in p-type material, implies $N_A = N_{AA}$. This is a good approximation when T is large or $E_F \gg E_A$. At very low temperatures, $g_A\exp([E_A - E_F]/kT)$ will not be significantly less than 1, and deionization will occur even when the Fermi level is above the impurity level. For very high doping with even a shallow-level acceptor, $E_F$ will lie below $E_A$, causing deionization. For deep-level acceptors, $E_F$ can easily fall below $E_A$.

Of  course,  this  is  mathematically  apparent,  but  it  is  also  physically  intuitive. 
At  very  low  temperatures,  there  will  not  be  enough  thermal  energy  to  ionize  the 
impurities,  so  deionization  is  expected.  At  high  impurity  concentrations,  the  dopant 
becomes  a  significant  part  of  the  composition  and  the  impurity  level  becomes  a  non- 
negligible  part  of  the  band  structure  (also,  the  energy  gap  narrows,  but  that  is  a 
completely  different  problem).  Impurity  banding  can  also  occur,  but  since  an  underlying 
assumption  of  these  derivations  is  uniform  doping,  impurity  banding  will  be  neglected. 
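
The sensitivity of (A.26) and (A.27) to the position of the Fermi level can be seen with a few numbers. The sketch below is illustrative only: the function names are hypothetical, and the degeneracy factors ($g_A = 4$, $g_D = 2$) and the room-temperature thermal energy are assumed values that do not come from the derivation above.

```python
import math


def ionized_acceptors(n_aa, e_a_minus_e_f, kt, g_a=4.0):
    """(A.26): ionized acceptor concentration; energies in the same units as kt."""
    return n_aa / (1.0 + g_a * math.exp(e_a_minus_e_f / kt))


def ionized_donors(n_dd, e_f_minus_e_d, kt, g_d=2.0):
    """(A.27): ionized donor concentration; energies in the same units as kt."""
    return n_dd / (1.0 + g_d * math.exp(e_f_minus_e_d / kt))


KT_300 = 0.02585  # assumed thermal energy at 300 K, in eV
for gap in (0.2, 0.05, 0.0, -0.05):  # E_F - E_A in eV
    fraction = ionized_acceptors(1.0, -gap, KT_300)
    print(f"E_F - E_A = {gap:+.2f} eV   ionized fraction = {fraction:.4f}")
```

With $n_{AA}$ set to 1 the function returns the ionized fraction, which falls to $1/(1 + g_A)$ when the Fermi level coincides with the acceptor level and continues to drop as $E_F$ moves below $E_A$.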

Assuming that the semiconductor has negligible trapping, (A.24) (charge density) becomes

$$\rho = q(P - N - N_A + P_D). \tag{A.28}$$

Semiconductor  Electric  Field 

With (A.28), the carrier concentration formulae (A.14) and (A.22), and the ionized impurity concentrations (A.26) and (A.27), the charge density can be written as a function of energy. Using this, an equation for the semiconductor field, $E_{IX}$, can be found via Poisson's equation. Starting from the d.c. steady-state Poisson equation in one dimension,

$$\varepsilon_S\,dE/dx = \rho, \tag{A.29}$$

where $\varepsilon_S$ is the dielectric 'constant' of silicon, $E$ is the electric field, and $\rho$ is as given in (A.28). Integrating by quadrature, noting $E = -(dV/dx)$, gives [43]

$$\varepsilon_S\,dE/dx = -\varepsilon_S (d/dx)(dV/dx)$$

$$= -\varepsilon_S [(dV/dx)(d/dV)](dV/dx)$$

$$= -(\varepsilon_S/2)(d/dV)(dV/dx)^2$$

$$= -(\varepsilon_S/2)(dE^2/dV). \tag{A.30}$$

Thus, from (A.29) and (A.30),

$$dE^2 = -(2/\varepsilon_S)\,\rho\,dV.$$

Each side of the equation can be integrated from the surface to zero ($E_{IX}$ to 0 for the field term and $U_{IX}$ to 0 for the charge density term), noting that $V = (kT/q)U$ [and $dV = (kT/q)dU$]:

$$E_{IX}^2 = -\frac{2kT}{q\varepsilon_S}\int_0^{U_{IX}} \rho(U)\,dU \tag{A.31}$$

$$E_{IX}^2 = \frac{2kT}{\varepsilon_S}\Big\{ N_V[\mathcal{F}_{3/2}(-U_{IX} - U_V + U_F) - \mathcal{F}_{3/2}(U_F - U_V)] + N_C[\mathcal{F}_{3/2}(U_{IX} + U_C - U_F) - \mathcal{F}_{3/2}(U_C - U_F)] + N_{AA}\Big[U_{IX} + \ln\frac{1 + g_A\exp(U_F - U_A - U_{IX})}{1 + g_A\exp(U_F - U_A)}\Big] - N_{DD}\Big[U_{IX} - \ln\frac{1 + g_D\exp(U_D - U_F + U_{IX})}{1 + g_D\exp(U_D - U_F)}\Big]\Big\}. \tag{A.32}$$

Use was made of (A.18) to integrate the FD functions. One might wonder how the $U_{IX}$ term ended up in (A.32) when there was no free variable U (or V) in the original equations for the carrier concentrations or the ionized impurities. The surface potential, $U_{IX}$, represents the amount of additional band bending of the silicon bands at the Si/SiO2 interface caused by the applied field. It was zero in the equilibrium, zero-field state, and therefore was not included in those equations.

Semiconductor  Capacitance 


Finally, the semiconductor capacitance, $C_{ix}$, is needed. For low frequency, this is given by

$$C_{ix} = -\rho(U_{IX})/E_{IX}. \tag{A.33}$$

The low-frequency assumption means that the minority carriers can be generated quickly enough to follow the small-signal voltage. For transistors (MOSTs), LFCV curves result even when high (1 MHz) frequencies are used in CV measurements because the (highly doped) source and drain will supply the necessary minority carriers to follow the a.c. signal, as long as the channel is short enough. Substituting (A.14), (A.22), (A.26), and (A.27) into (A.28n) or (A.28p), and then placing the result in (A.33), gives

$$C_{ix} = \frac{q}{E_{IX}}\Big[-N_V\mathcal{F}_{1/2}(-U_{IX} - U_V + U_F) + N_C\mathcal{F}_{1/2}(U_{IX} + U_C - U_F) + \frac{N_{AA}}{1 + g_A\exp(U_F - U_A - U_{IX})} - \frac{N_{DD}}{1 + g_D\exp(U_D - U_F + U_{IX})}\Big]. \tag{A.34}$$

A degenerate, deionized LFCV curve can now be generated by substituting (A.32) into (A.5) [the gate voltage formula] and (A.34) into (A.7) [the gate capacitance formula], where $V_G$ and $C_g$ are generated in pairs as a function of $V_{IX}$ (or $U_{IX}$).
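
The parametric procedure just described can be sketched in a few dozen lines. The Python below is a simplified illustration under stated assumptions, not the program used in this work: it treats a metal-gate, p-type capacitor with completely ionized acceptors (so the deionization terms drop out), obtains the field from the first integral of Poisson's equation, and uses assumed silicon parameters, an assumed 130 Å oxide, and an assumed flat-band voltage. All names are hypothetical.

```python
import math

Q = 1.602176634e-19        # elementary charge, C
KT = 1.380649e-23 * 300.0  # thermal energy at 300 K, J
EPS_0 = 8.854e-14          # vacuum permittivity, F/cm

# Illustrative parameters (assumptions, not values from this work).
EPS_S, EPS_OX = 11.7 * EPS_0, 3.9 * EPS_0
N_C, N_V, N_AA = 2.8e19, 1.04e19, 1.0e17   # cm^-3
U_G = 1.12 * Q / KT                         # band gap in units of kT
C_O = EPS_OX / 130e-8                       # 130 angstrom oxide, F/cm^2
V_FB = -0.9                                 # flat-band voltage, V


def fd(j, eta, n=1000):
    """Dingle-normalized Fermi-Dirac integral by Simpson's rule (eps = t**2)."""
    t_max = math.sqrt(max(eta, 0.0) + 50.0)
    h = t_max / n

    def g(t):
        return 2.0 * t ** (2.0 * j + 1.0) / (1.0 + math.exp(t * t - eta))

    total = g(0.0) + g(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / (3.0 * math.gamma(j + 1.0))


def bulk_eta_p():
    """Bisection for eta' = (E_V - E_F)/kT from bulk neutrality p - n = N_AA."""
    lo, hi = -40.0, 10.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        net = N_V * fd(0.5, mid) - N_C * fd(0.5, -U_G - mid) - N_AA
        lo, hi = (mid, hi) if net < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)


def cv_curve(u_values):
    """Return (V_G, C_g/C_O) pairs parameterized by the surface potential U_IX."""
    eta_p = bulk_eta_p()   # holes: (E_V - E_F)/kT in the bulk
    eta_n = -U_G - eta_p   # electrons: (E_F - E_C)/kT in the bulk
    f32_p, f32_n = fd(1.5, eta_p), fd(1.5, eta_n)
    pairs = []
    for u in u_values:
        # Charge density (A.28) with complete ionization, divided by q.
        rho_q = N_V * fd(0.5, eta_p - u) - N_C * fd(0.5, eta_n + u) - N_AA
        # Field squared from the first integral of Poisson's equation.
        e2 = (2.0 * KT / EPS_S) * (N_V * (fd(1.5, eta_p - u) - f32_p)
                                   + N_C * (fd(1.5, eta_n + u) - f32_n)
                                   + N_AA * u)
        e_ix = math.copysign(math.sqrt(e2), u)
        c_ix = -Q * rho_q / e_ix                          # (A.33)
        v_g = V_FB + (KT / Q) * u + EPS_S * e_ix / C_O    # (A.5)
        c_g = c_ix * C_O / (c_ix + C_O)                   # (A.7)
        pairs.append((v_g, c_g / C_O))
    return pairs


# Half-integer U_IX values avoid the 0/0 form of (A.33) at flat band.
curve = cv_curve([-7.5 + k for k in range(49)])
for v_g, ratio in curve[::6]:
    print(f"V_G = {v_g:6.2f} V   C_g/C_O = {ratio:.3f}")
```

The surface potential is the only swept quantity; the gate voltage and gate capacitance are both outputs, which is why no equation ever has to be solved for $V_{IX}$ at a given $V_G$.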

Fermi  Level 

Throughout this derivation, it has been assumed that the Fermi level is known. Considering the degenerate, ionized, p-type case first, (A.22) gives, assuming $N_{AA} \gg n_i$,

$$N_{AA} = N_V\mathcal{F}_{1/2}([E_V - E_F]/kT). \tag{A.35}$$

For the intrinsic case,

$$n_i = N_V\mathcal{F}_{1/2}([E_V - E_I]/kT). \tag{A.36}$$

Solving these two cases for the Fermi level relative to the intrinsic level gives

$$E_F - E_I = kT[\mathcal{F}^{-1}_{1/2}(n_i/N_V) - \mathcal{F}^{-1}_{1/2}(N_{AA}/N_V)], \tag{A.37}$$

where $\mathcal{F}^{-1}_{1/2}$ is the inverse Fermi-Dirac integral of order 1/2. The inverse Fermi-Dirac integral is analogous to a natural logarithm in the Boltzmann regime. It is important to note that, unlike logarithms, $\mathcal{F}^{-1}_{1/2}(X/Y) \neq \mathcal{F}^{-1}_{1/2}(X) - \mathcal{F}^{-1}_{1/2}(Y)$. Furthermore, (A.37) can be roughly approximated by $kT\mathcal{F}^{-1}_{1/2}(n_i/N_{AA})$, but not by $-kT\mathcal{F}^{-1}_{1/2}(N_{AA}/n_i)$.

Analogous to the p-type case, the Fermi level for the n-type, degenerate (Fermi-Dirac), fully ionized case is given by

$$E_F - E_I = kT[\mathcal{F}^{-1}_{1/2}(N_{DD}/N_C) - \mathcal{F}^{-1}_{1/2}(n_i/N_C)]. \tag{A.38}$$

In the deionized region, there is no analytical formula for $E_F$ (or the normalized $U_F$); $E_F$ must be found iteratively.
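
One possible way to carry out the iteration is bisection, since both the inverse FD integral and the deionized neutrality condition involve monotonic functions. The sketch below is an assumption-laden illustration, not the procedure used in this work: the function names, the Simpson quadrature, the bracketing interval, the value of $N_V$, the 45 meV acceptor level, and $g_A = 4$ are all hypothetical choices, and the electron concentration is neglected in the p-type neutrality condition.

```python
import math


def fd_half(eta, n=1000):
    """Dingle-normalized F_1/2(eta) by Simpson's rule after eps = t**2."""
    t_max = math.sqrt(max(eta, 0.0) + 50.0)
    h = t_max / n

    def g(t):
        return 2.0 * t * t / (1.0 + math.exp(t * t - eta))

    total = g(0.0) + g(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / (3.0 * math.gamma(1.5))


def bisect(func, lo, hi, iterations=50):
    """Root of an increasing function on [lo, hi] by interval halving."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_fd_half(x):
    """eta such that F_1/2(eta) = x, for x > 0 within the bracketed range."""
    return bisect(lambda eta: fd_half(eta) - x, -100.0, 100.0)


def eta_p_deionized(n_aa, n_v, u_a, g_a=4.0):
    """(E_V - E_F)/kT in p-type bulk, where u_a = (E_A - E_V)/kT.

    Solves N_V F_1/2(eta') = N_AA / (1 + g_A exp(eta' + u_a)): the hole
    concentration (A.22) set equal to the ionized acceptors (A.26).
    """
    def net(eta_p):
        ionized = n_aa / (1.0 + g_a * math.exp(eta_p + u_a))
        return n_v * fd_half(eta_p) - ionized
    return bisect(net, -100.0, 100.0)


n_v, u_a = 1.04e19, 0.045 / 0.02585   # assumed N_V and acceptor depth
for n_aa in (1e16, 1e18, 1e20):
    full = inverse_fd_half(n_aa / n_v)           # complete ionization, (A.35)
    partial = eta_p_deionized(n_aa, n_v, u_a)    # with deionization
    print(f"N_AA = {n_aa:.0e}   eta' full = {full:7.3f}   deionized = {partial:7.3f}")
```

Comparing the two columns shows where the complete-ionization formula (A.35) stops being a good estimate of the Fermi level as the doping increases.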